Preface



SAC 2000 was the seventh in a series of annual workshops on Selected Areas in 
Cryptography. Previous workshops were held at Queen’s University in Kingston 
(1994, 1996, 1998, and 1999) and at Carleton University in Ottawa (1995 and 
1997). The intent of the workshops is to provide a relaxed atmosphere in which 
researchers in cryptography can present and discuss new work on selected areas 
of current interest. 

The themes for the SAC 2000 workshop were: 

— design and analysis of symmetric key cryptosystems, 

— primitives for private key cryptography, including block and stream ciphers, 
hash functions, and MACs, 

— efficient implementations of cryptographic systems in public and private key 
cryptography, 

— cryptographic solutions for web/internet security. 

A total of 41 papers were submitted to SAC 2000, one of which was subsequently 
withdrawn. After a review process that had all papers reviewed by at least 3 
referees, 24 papers were accepted for presentation at the workshop. As well, we 
were fortunate to have the following two invited speakers at SAC 2000: 

— M. Bellare, UCSD (U.S.A.) 

“The Provable-Security Approach to Authenticated Session-Key Exchange” 

— D. Boneh, Stanford U. (U.S.A.) 

“Message Authentication in a Multicast Environment” 

The program committee for SAC 2000 consisted of the following members: 
L. Chen, H. Keys, L. Knudsen, S. Moriai, L. O’Connor, D. Stinson, S. Tavares, 
S. Vaudenay, A. Youssef, and R. Zuccherato. Many thanks are due to the program
committee for their hard work. Also, Amr Youssef provided great assistance
in making the reviewing process run smoothly. 

We are appreciative of the financial support provided by Certicom Corporation, 
CITO, Entrust Technologies, MITACS, and the University of Waterloo. Special 
thanks are due to Frances Hannigan, who was responsible for the local
arrangements, and for making sure that everything ran smoothly during the workshop.
Fran also assisted in preparing the workshop proceedings. Many people helped 
in the reviewing process by acting as sub-referees, and we appreciate all their 
help. Finally, we thank all the workshop participants for making SAC 2000 a 
success. 
Abstract. The voice privacy of the IS-95 CDMA cellular system is analyzed
in this paper. By exploiting information redundancy on the downlink
traffic channel, it is shown that an eavesdropper can recover the voice
privacy mask after eavesdropping on the downlink traffic channel for
about one second. Thus, IS-95 CDMA voice privacy is vulnerable to
ciphertext-only attacks.



1 Introduction 



IS-95 Code Division Multiple Access (CDMA) is an interim industry standard [3]
for cellular communication systems. In IS-95 CDMA, a pseudo-random pattern,
a high bit-rate binary sequence known as the long code sequence, is added
to the low bit-rate voice signal. Adding a high bit-rate noise-like signal to a voice
signal makes the voice signal more robust and less susceptible to interference.
It enables low-power transmission, resulting in cheaper mobile
stations with long-lasting battery life. The long code sequence, which is
known only to the designated receiver, is also expected to provide a certain level of
privacy to the voice signal. To decode the voice signal, the eavesdropper has
to recover the long code sequence from the intercepted signal. The long code
sequence is generated by the long code generator as shown in Figure 3. The long
code generator consists of a 42-bit number called the long code mask and a 42-bit
linear feedback shift register (LFSR) specified by the following characteristic
polynomial:



$$ x^{42} + x^{35} + x^{33} + x^{31} + x^{27} + x^{26} + x^{25} + x^{22} + x^{21} + x^{19} + x^{18} + x^{17} + x^{16} + x^{10} + x^{7} + x^{6} + x^{5} + x^{3} + x^{2} + x + 1. $$

The inner product of the LFSR state and the long code mask produces the long
code sequence.

Voice privacy of IS-95 CDMA is provided by means of the long code mask.
The long code mask is not transmitted through any channel; it is constructed
by the base station and the mobile station. To recover the long code sequence,
the eavesdropper may exhaustively search the 42-bit long code mask, with a
time complexity of $O(2^{42})$. This attack is viable but is hard to implement in
real time. Alternatively, it can be shown that the long code sequence can also
be recovered if the eavesdropper can obtain 42 bits of plaintext-ciphertext pairs.
As there are many mobile stations transmitting simultaneously on the traffic
channel and each mobile station transmits for only about 3 minutes on
average, it is rather difficult to obtain 42 bits of the plaintext message.
In this paper, we investigate the security of IS-95 CDMA voice privacy under
ciphertext-only attacks. By exploiting information redundancy introduced by
channel coding, it is shown that the eavesdropper can recover the voice privacy
mask after eavesdropping on the traffic channel for about one
second. This paper is organized as follows. Section 2 gives an equivalent long
code generator which simplifies the analysis of CDMA voice privacy. Section
3 describes two kinds of ciphertext-only attacks and Section 4 presents some
concluding remarks.

2 An Equivalent Long Code Generator 

Let $(m_1, m_2, \ldots, m_{42})$ denote the 42-bit mask and
$(s_1(k), s_2(k), \ldots, s_{42}(k))$ denote the state of the LFSR at time
instant $k$. Then the long code sequence $c(k)$ at time instant $k$ can be
represented as

$$ c(k) = m_1 s_1(k) + m_2 s_2(k) + \cdots + m_{42} s_{42}(k), \tag{1} $$

where the addition is modulo-2 addition. Since $s_1(k), s_2(k), \ldots, s_{42}(k)$ are
the outputs of the 42 stages of the LFSR, they are the same sequence and
differ only in phase. So, for any $i$, $1 \le i \le 42$, $s_i(k)$ satisfies the following linear
recurrence equation:

$$ s_i(k) = a_1 s_i(k-1) + a_2 s_i(k-2) + \cdots + a_{42} s_i(k-42), \tag{2} $$

where $a_i$ is the coefficient of $x^i$ in the characteristic polynomial of the LFSR.
Substituting (2) into (1), we have
$$ c(k) = \sum_{i=1}^{42} m_i s_i(k)
        = \sum_{i=1}^{42} m_i \sum_{j=1}^{42} a_j s_i(k-j)
        = \sum_{j=1}^{42} a_j \sum_{i=1}^{42} m_i s_i(k-j). $$

By equation (1),

$$ \sum_{i=1}^{42} m_i s_i(k-j) = c(k-j). $$

Hence, it follows that

$$ c(k) = a_1 c(k-1) + a_2 c(k-2) + \cdots + a_{42} c(k-42), \tag{3} $$

which means that the long code sequence is also generated by the same LFSR. 
Although every mobile station uses a different long code mask, their long code 
sequences are the same except for the different phases. The long code mask only 
affects the phase of the long code sequence. 

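Let $C(k) = (c(k), c(k-1), \ldots, c(k-41))$ and
$C(k+1) = (c(k+1), c(k), \ldots, c(k-40))$ be two consecutive states of the
LFSR. By equation (3), we have

$$ \begin{bmatrix} c(k+1) \\ c(k) \\ c(k-1) \\ \vdots \\ c(k-39) \\ c(k-40) \end{bmatrix}
 = \begin{bmatrix}
 a_1 & a_2 & a_3 & \cdots & a_{40} & a_{41} & a_{42} \\
 1 & 0 & 0 & \cdots & 0 & 0 & 0 \\
 0 & 1 & 0 & \cdots & 0 & 0 & 0 \\
 \vdots & & & \ddots & & & \vdots \\
 0 & 0 & 0 & \cdots & 1 & 0 & 0 \\
 0 & 0 & 0 & \cdots & 0 & 1 & 0
 \end{bmatrix}
 \begin{bmatrix} c(k) \\ c(k-1) \\ c(k-2) \\ \vdots \\ c(k-40) \\ c(k-41) \end{bmatrix}. $$

Let $A^t$ denote the 42 x 42 matrix of the above equation and $a^t$ denote the
vector $(a_1, a_2, \ldots, a_{42})$, where $A^t$ is the transpose of $A$ and $a^t$ is
the transpose of $a$. Then for any $n > k$,

$$ C(n) = C(k) A^{n-k}. \tag{4} $$

By equation (4), it follows that

$$ c(n) = C(k) A^{n-k-1} a. \tag{5} $$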

Equation (5) indicates that the state $C(k)$ of the LFSR can be calculated if
any 42 bits of the long code sequence are known. Since the LFSR generating
the long code sequence is publicly known, the coefficients $a_i$, $1 \le i \le 42$, are
available to the eavesdropper. Thus, the long code sequence can be generated if
the eavesdropper can recover 42 bits of the long code sequence.
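
The claim of this section, that the masked output obeys the same recurrence as the register itself, can be checked numerically. The following is a minimal sketch, not part of the original analysis. It assumes that the lags with $a_i = 1$ are the exponents of the nonzero terms of the long code polynomial as commonly quoted for IS-95, and that stage $i$ of the register holds a delayed copy of one underlying sequence; the names `TAPS`, `lfsr_sequence`, `masked_output`, and `satisfies_recurrence` are illustrative.

```python
import random

# Lags i with a_i = 1 in recurrence (3). Assumption of this sketch: they are the
# exponents of the nonzero terms of the long code polynomial (constant term aside).
TAPS = [1, 2, 3, 5, 6, 7, 10, 16, 17, 18, 19, 21, 22, 25, 26, 27, 31, 33, 35, 42]
N = 42  # register length


def lfsr_sequence(seed, length):
    """Extend the first N bits to `length` bits: s(k) = XOR of s(k - i), i in TAPS."""
    s = list(seed)
    for k in range(N, length):
        bit = 0
        for i in TAPS:
            bit ^= s[k - i]
        s.append(bit)
    return s


def masked_output(s, mask, k):
    """c(k) as in (1). Stage i is modelled as the delayed copy s(k + N - i)."""
    bit = 0
    for i in range(1, N + 1):
        bit ^= mask[i - 1] & s[k + N - i]
    return bit


def satisfies_recurrence(c):
    """Check that c(k) equals the XOR of c(k - i) over the taps for every k >= N."""
    for k in range(N, len(c)):
        bit = 0
        for i in TAPS:
            bit ^= c[k - i]
        if bit != c[k]:
            return False
    return True


rng = random.Random(2000)
seed = [rng.randint(0, 1) for _ in range(N)]
mask = [rng.randint(0, 1) for _ in range(N)]
s = lfsr_sequence(seed, 600)
c = [masked_output(s, mask, k) for k in range(600 - N)]
print(satisfies_recurrence(c))
```

Because the check relies only on linearity, it should succeed for any seed and any mask, which is the content of equation (3).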
3 Attacks on CDMA Voice Privacy

Before analyzing IS-95 CDMA voice privacy, we give a brief description
of the CDMA channels. The CDMA channels consist of traffic channels and control
channels. The traffic channels carry user information while the control channels
carry signalling information. The traffic channels can be further divided into
downlink and uplink traffic channels.

3.1 CDMA Traffic Channels 

The downlink channel carries information from the base station to the mobile
station. On the downlink traffic channel, the vocoder accepts the voice signal
and produces a compressed data stream. IS-95 CDMA specifies a variable-rate
vocoder operating at full, 1/2, 1/4, and 1/8 rates. The rate is determined according
to the power level of the background noise. There are currently two types of
vocoders: one operating at 9.6 kbps and the other operating at 14.4 kbps,
referred to as rate set 1 and rate set 2, respectively. Rate set 1 contains
four elements: 9.6, 4.8, 2.4, and 1.2 kbps. Rate set 2 also contains four elements:
14.4, 7.2, 3.6, and 1.8 kbps. The mobile station has to support rate set 1, while
rate set 2 is optional.

The data stream from the vocoder is structured in 20-ms frames. The full rate
of the rate set 1 vocoder is 8.6 kbps, which generates 172 bits every 20 ms. The frame
quality indicator, which consists of 12 cyclic redundancy check (CRC) bits derived from
the 172 information bits, is added to the 172 bits along with an 8-bit tail (set to
0). The 9.6-kbps frame is thus a result of 192 bits (172 + 12 + 8) transmitted every 20
ms. The 4.8-kbps frame has the same structure, while 2.4- and 1.2-kbps frames
do not have frame quality indicator fields since most of the information sent in
these frames is background noise. Rate set 2 frames have structures similar to
those of rate set 1. The 20-ms data frames are encoded, interleaved, scrambled, spread,
and modulated before they are sent onto the air interface. Figure 2 shows the
different functions which act on rate set 1 data frames.




Fig. 2. Rate set 1 downlink traffic channel generation 



The convolutional encoder provides error-correction capability to the downlink
traffic channels. The convolutional encoder replaces each input bit with
two bits (called code symbols). The symbol repeater repeats the code symbols
produced by the convolutional encoder as necessary to produce an output stream
at the fixed rate of 19.2 ksps (kilo-symbols per second). For example,
it does not repeat anything if the input rate is 19.2 ksps, it sends each
symbol twice if the input rate is 9.6 ksps, and it sends each symbol 4 times if the
input rate is 4.8 ksps. The block interleaver shuffles the code symbols in each data
frame. Data scrambling is provided through the long code generator. A power
control subchannel is continuously transmitted on the downlink traffic channel.
It is used to control the mobile station's power on the uplink. This subchannel
transmits at a rate of one bit every 1.25 ms (i.e., 800 bps). The power control
bit, which is two symbols long, replaces two consecutive symbols on the downlink
traffic channel. The 19.2 ksps data stream, which has been punctured with the
power control bits of the power control subchannel, is spread by a Walsh code.
Following the Walsh code spreading, the data is modulated for transmission.
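
The frame arithmetic for rate set 1 can be worked through in a few lines. This is a small sketch that uses only the figures quoted above (20-ms frames, a half-rate encoder, and 19.2 ksps after repetition); the function name `frame_layout` is illustrative.

```python
FRAME_MS = 20              # frame duration in milliseconds
SYMBOLS_PER_FRAME = 384    # 19.2 ksps * 20 ms


def frame_layout(bits_per_second):
    """Return (bits per frame, code symbols per frame, copies of each symbol)."""
    bits = bits_per_second * FRAME_MS // 1000       # bits entering the encoder
    code_symbols = 2 * bits                         # half-rate encoder output
    copies = SYMBOLS_PER_FRAME // code_symbols      # times each symbol is sent
    return bits, code_symbols, copies


for rate in (9600, 4800, 2400, 1200):
    print(rate, frame_layout(rate))
```

Each rate fills the same 384-symbol frame, with every code symbol sent 1, 2, 4, or 8 times.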

Like the downlink traffic channel, the uplink traffic channel also supports 
two rate sets, depending on the type of vocoder used. Figure 3 shows the overall 
structure of the uplink traffic channel for rate set 1. 




Fig. 3. Rate set 1 uplink traffic channel generation



3.2 Information Redundancy on the Downlink Traffic Channel

On the downlink traffic channel, information bits from the vocoder are coded by
a convolutional encoder as shown in Figure 4. The convolutional encoder has a 1-bit
input, a 2-bit output, and an 8-bit memory. Initially the 8 memory bits are filled with
all 0s. For every bit of input, the convolutional encoder outputs 2 bits. In coding
theory, each bit of output is called a code symbol. Such a convolutional encoder
is called a half-rate convolutional encoder [1]. Because of the 8-bit memory, each
code symbol is related to 9 information bits. The number 9 is called the constraint
length of the convolutional encoder. The half-rate convolutional encoder with
constraint length of 9 is also called a (2, 1, 8) convolutional encoder.

Fig. 4. Half-rate convolutional encoder

Let $b = (b_0, b_1, b_2, \ldots)$ denote the input sequence entering the convolutional
encoder. The two output sequences

$$ v^{(1)} = (v_0^{(1)}, v_1^{(1)}, v_2^{(1)}, \ldots) $$

and

$$ v^{(2)} = (v_0^{(2)}, v_1^{(2)}, v_2^{(2)}, \ldots) $$

can be computed as follows:

$$ v_l^{(1)} = b_l + b_{l-1} + b_{l-2} + b_{l-3} + b_{l-5} + b_{l-7} + b_{l-8}, \tag{6} $$

$$ v_l^{(2)} = b_l + b_{l-2} + b_{l-3} + b_{l-4} + b_{l-8}, \tag{7} $$

where $b_l = 0$ for all $l < 0$. The two output sequences are multiplexed into a
single sequence, called the code word. Let
$v = (v_0^{(1)}, v_0^{(2)}, v_1^{(1)}, v_1^{(2)}, \ldots)$
denote the code word. From (6) and (7), the code word satisfies the following
equation:

$$ vH = 0, \tag{8} $$

where $H$ is a semi-infinite matrix given by

$$ H = \begin{bmatrix}
1 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 1 &   &        \\
1 & 1 & 1 & 1 & 0 & 1 & 0 & 1 & 1 &   &        \\
  & 1 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 1 &        \\
  & 1 & 1 & 1 & 1 & 0 & 1 & 0 & 1 & 1 &        \\
  &   & 1 & 0 & \cdots &  &  &  &  &  &        \\
  &   & 1 & 1 & \cdots &  &  &  &  &  &        \\
  &   &   &   & \ddots &  &  &  &  &  &
\end{bmatrix}. $$

In equation (8), the code word is treated as a semi-infinite sequence since
the input sequence to the convolutional encoder may be semi-infinite. When the
input sequence to the convolutional encoder is truncated to $k$ bits and the last
8 bits are all zero, the code symbol sequence has length $n = 2k$ and satisfies the
following equation:

$$ v[H]_n = 0, \tag{9} $$

where $[H]_n$ is composed of the first $n$ rows and $n/2 + 8$ columns of $H$.
Recall that the information bits entering the convolutional encoder are structured
in 20-ms frames and the last 8 bits of each frame are set to zero. As a
consequence, the code symbols in every 20-ms frame satisfy equation (9). Thus,
every frame on the downlink traffic channel contains redundant information. The
redundant information can be used to solve for the long code sequence.
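
The parity relation (9) can be illustrated with a short simulation. The sketch below encodes one full-rate frame with the tap positions read off (6) and (7) and then evaluates the $k + 8$ checks, pairing the first output stream with the taps of (7) and the second with the taps of (6), as the rows of $H$ suggest. The names `G1`, `G2`, `encode`, and `syndrome` are illustrative, and the random frame contents are placeholders.

```python
import random

G1 = (0, 1, 2, 3, 5, 7, 8)   # delays of the input bits XORed in (6)
G2 = (0, 2, 3, 4, 8)         # delays of the input bits XORed in (7)


def encode(bits):
    """Return the code word (v1_0, v2_0, v1_1, v2_1, ...) for one frame."""
    word = []
    for l in range(len(bits)):
        v1 = 0
        for d in G1:
            if l - d >= 0:
                v1 ^= bits[l - d]
        v2 = 0
        for d in G2:
            if l - d >= 0:
                v2 ^= bits[l - d]
        word += [v1, v2]
    return word


def syndrome(word):
    """Evaluate the k + 8 checks of (9) for a word of n = 2k code symbols."""
    k = len(word) // 2
    v1, v2 = word[0::2], word[1::2]
    checks = []
    for col in range(k + 8):
        bit = 0
        for d in G2:                 # first stream combined with the taps of (7)
            if 0 <= col - d < k:
                bit ^= v1[col - d]
        for d in G1:                 # second stream combined with the taps of (6)
            if 0 <= col - d < k:
                bit ^= v2[col - d]
        checks.append(bit)
    return checks


rng = random.Random(95)
frame = [rng.randint(0, 1) for _ in range(184)] + [0] * 8   # 8-bit zero tail
word = encode(frame)
print(len(word), len(syndrome(word)), any(syndrome(word)))
```

For a frame that ends in the 8-bit zero tail, all 200 checks evaluate to zero, while flipping a single code symbol makes at least one check fail. This is the redundancy the attack exploits.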

Besides convolutional encoding, symbol repetition also introduces redundant
information. The code symbols from the convolutional encoder are repeated
whenever the information rate is lower than 9.6 kbps. Every code symbol at
the 4.8 kbps rate is repeated 1 time, every code symbol at the 2.4 kbps rate is
repeated 3 times, and every code symbol at the 1.2 kbps rate is repeated 7 times.
Every frame contains 384 code symbols after repetition. Let
$u = (u_1, u_2, \ldots, u_{384})$ denote the frame after code symbol repetition.
If each code symbol appears two times in $u$, it is obvious that

$$ uE_2 = 0, \tag{10} $$

where $E_2$ is a matrix described by

$$ E_2 = \begin{bmatrix}
1 & 0 & 0 & \cdots & 0 \\
1 & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & \cdots & 0 \\
0 & 1 & 0 & \cdots & 0 \\
\vdots & & & & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 1
\end{bmatrix}. $$

If we sample every other code symbol in $u = (u_1, u_2, \ldots, u_{384})$, the sampled code
symbols $(u_1, u_3, \ldots, u_{383})$ should satisfy equation (9) since $(u_1, u_3, \ldots, u_{383})$ are
the output symbols of the convolutional encoder. Mathematically, the sampling
of every other symbol of $u$ can be described by the matrix
multiplication $uD_2$, where

$$ D_2 = \begin{bmatrix}
1 & 0 & 0 & \cdots & 0 \\
0 & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 0 & \cdots & 0 \\
\vdots & & & & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}. $$

Substituting $v$ in (9) by $uD_2$, we have

$$ uD_2[H]_{192} = 0. \tag{11} $$

Generally, if each code symbol in $u = (u_1, u_2, \ldots, u_{384})$ appears $k$ times, where
$k = 1, 2, 4, 8$, then the code symbol frame $u$ satisfies the
following equations:

$$ uE_k = 0, \tag{12} $$

$$ uD_k[H]_{384/k} = 0, \tag{13} $$

where $E_1 = 0$, $D_1 = I$, and $E_4$, $D_4$, $E_8$, and $D_8$ can be derived in similar ways as $E_2$
and $D_2$.
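
The repetition and sampling matrices can be built explicitly to check (12) and the sampling step behind (13). This is a sketch under one assumption that the text leaves open: for $k = 4$ and $k = 8$, $E_k$ is taken to have one column per adjacent pair inside each block of $k$ copies, which generalizes the $E_2$ shown above. The helper names are illustrative, and for $k = 1$ the empty column list stands in for $E_1 = 0$.

```python
import random

N_SYMBOLS = 384


def repetition_matrix(k):
    """Columns of E_k: one column per adjacent pair inside each block of k copies."""
    cols = []
    for start in range(0, N_SYMBOLS, k):
        for j in range(k - 1):
            col = [0] * N_SYMBOLS
            col[start + j] = col[start + j + 1] = 1
            cols.append(col)
    return cols


def sampling_matrix(k):
    """Columns of D_k: column i selects the first copy of the i-th code symbol."""
    cols = []
    for start in range(0, N_SYMBOLS, k):
        col = [0] * N_SYMBOLS
        col[start] = 1
        cols.append(col)
    return cols


def row_times(u, cols):
    """Row vector u times a matrix given as a list of columns, over GF(2)."""
    return [sum(a & b for a, b in zip(u, col)) % 2 for col in cols]


rng = random.Random(13)
for k in (1, 2, 4, 8):
    v = [rng.randint(0, 1) for _ in range(N_SYMBOLS // k)]   # encoder output
    u = [bit for bit in v for _ in range(k)]                 # after repetition
    print(k, len(repetition_matrix(k)), any(row_times(u, repetition_matrix(k))),
          row_times(u, sampling_matrix(k)) == v)
```

Under this construction $E_k$ contributes $384 - 384/k$ equations per frame, and $uD_k$ returns the encoder output that must then satisfy (9).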



3.3 Recovery of the Long Code Sequence 

To recover the long code sequence, the eavesdropper intercepts the downlink traffic
channel, demodulates the intercepted data frames, and despreads the data
frames using the Walsh code specific to the channel. Let
$r = (r_1, r_2, \ldots, r_{384})$ denote the data frame after despreading.
Correspondingly, let $c = (c(t_1), c(t_2), \ldots, c(t_{384}))$ denote the 384 bits
of the long code sequence used for scrambling, $s = (s_1, s_2, \ldots, s_{384})$
denote the output of the block interleaver, and $u = (u_1, u_2, \ldots, u_{384})$
denote the input to the block interleaver. Figure 5 describes the relationship
among these notations on the downlink traffic channel.

Fig. 5. Block interleaving and scrambling



The block interleaver is a permutation of the 384 input code symbols. The
input symbols are entered as a 24 x 16 array and the interleaver produces a
24 x 16 output array. Table 1 describes the input array. The table is read down
by columns from left to right. That is, the first input symbol $u_1$ is at the
top left, the second input symbol $u_2$ is just below the first input symbol, and the
25th input symbol $u_{25}$ is just to the right of the first input symbol. The output
array is given by Table 2, which is read the same way as Table 1; that is, the
first output symbol is $u_1$, the second output symbol is $u_{65}$, and the 25th output
symbol is $u_9$.

Mathematically, the block interleaver can be represented by a permutation
matrix $P$, namely,

$$ s = uP, \tag{14} $$

where $P$ is a 384 x 384 matrix in which each row and each column contains exactly one 1.
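
The permutation can also be written as a short function. The sketch below is an observation about the entries of Table 2 rather than a formula stated in the paper: within a row the entries follow a 4-bit bit-reversed count, and the row offsets follow a 6-by-4 pattern with a 2-bit bit reversal. The names `bitrev`, `output_order`, and `interleave` are illustrative.

```python
def bitrev(x, width):
    """Reverse the `width` low-order bits of x."""
    r = 0
    for _ in range(width):
        r = (r << 1) | (x & 1)
        x >>= 1
    return r


def output_order():
    """1-based input indices in the order they leave the interleaver (s = uP)."""
    order = []
    for col in range(16):            # the output array is read down its columns
        for row in range(24):
            base = 64 * (row % 6) + 16 * bitrev(row // 6, 2)
            order.append(1 + base + bitrev(col, 4))
    return order


def interleave(u):
    """Apply the permutation to a frame of 384 code symbols."""
    return [u[i - 1] for i in output_order()]


order = output_order()
print(order[:3], order[24], order[191], order[383])
```

The first, second, and 25th outputs agree with the symbols named in the text, and the 192nd and 384th outputs are $u_{383}$ and $u_{384}$.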

For the moment, let us assume that the power control bits are not transmitted
through the downlink traffic channel. Then we have

$$ r = s \oplus c. \tag{15} $$



Table 1. Downlink traffic channel interleaver input
(symbols 49-64, 113-128, 177-192, 241-256, 305-320, 337-352, and 369-384
are enclosed in dotted boxes in the original)

   1  25  49  73  97 121 145 169 193 217 241 265 289 313 337 361
   2  26  50  74  98 122 146 170 194 218 242 266 290 314 338 362
   3  27  51  75  99 123 147 171 195 219 243 267 291 315 339 363
   4  28  52  76 100 124 148 172 196 220 244 268 292 316 340 364
   5  29  53  77 101 125 149 173 197 221 245 269 293 317 341 365
   6  30  54  78 102 126 150 174 198 222 246 270 294 318 342 366
   7  31  55  79 103 127 151 175 199 223 247 271 295 319 343 367
   8  32  56  80 104 128 152 176 200 224 248 272 296 320 344 368
   9  33  57  81 105 129 153 177 201 225 249 273 297 321 345 369
  10  34  58  82 106 130 154 178 202 226 250 274 298 322 346 370
  11  35  59  83 107 131 155 179 203 227 251 275 299 323 347 371
  12  36  60  84 108 132 156 180 204 228 252 276 300 324 348 372
  13  37  61  85 109 133 157 181 205 229 253 277 301 325 349 373
  14  38  62  86 110 134 158 182 206 230 254 278 302 326 350 374
  15  39  63  87 111 135 159 183 207 231 255 279 303 327 351 375
  16  40  64  88 112 136 160 184 208 232 256 280 304 328 352 376
  17  41  65  89 113 137 161 185 209 233 257 281 305 329 353 377
  18  42  66  90 114 138 162 186 210 234 258 282 306 330 354 378
  19  43  67  91 115 139 163 187 211 235 259 283 307 331 355 379
  20  44  68  92 116 140 164 188 212 236 260 284 308 332 356 380
  21  45  69  93 117 141 165 189 213 237 261 285 309 333 357 381
  22  46  70  94 118 142 166 190 214 238 262 286 310 334 358 382
  23  47  71  95 119 143 167 191 215 239 263 287 311 335 359 383
  24  48  72  96 120 144 168 192 216 240 264 288 312 336 360 384

Table 2. Downlink traffic channel interleaver output
(rows 18 to 24 are enclosed in dotted boxes in the original)

   1   9   5  13   3  11   7  15   2  10   6  14   4  12   8  16
  65  73  69  77  67  75  71  79  66  74  70  78  68  76  72  80
 129 137 133 141 131 139 135 143 130 138 134 142 132 140 136 144
 193 201 197 205 195 203 199 207 194 202 198 206 196 204 200 208
 257 265 261 269 259 267 263 271 258 266 262 270 260 268 264 272
 321 329 325 333 323 331 327 335 322 330 326 334 324 332 328 336
  33  41  37  45  35  43  39  47  34  42  38  46  36  44  40  48
  97 105 101 109  99 107 103 111  98 106 102 110 100 108 104 112
 161 169 165 173 163 171 167 175 162 170 166 174 164 172 168 176
 225 233 229 237 227 235 231 239 226 234 230 238 228 236 232 240
 289 297 293 301 291 299 295 303 290 298 294 302 292 300 296 304
 353 361 357 365 355 363 359 367 354 362 358 366 356 364 360 368
  17  25  21  29  19  27  23  31  18  26  22  30  20  28  24  32
  81  89  85  93  83  91  87  95  82  90  86  94  84  92  88  96
 145 153 149 157 147 155 151 159 146 154 150 158 148 156 152 160
 209 217 213 221 211 219 215 223 210 218 214 222 212 220 216 224
 273 281 277 285 275 283 279 287 274 282 278 286 276 284 280 288
 337 345 341 349 339 347 343 351 338 346 342 350 340 348 344 352
  49  57  53  61  51  59  55  63  50  58  54  62  52  60  56  64
 113 121 117 125 115 123 119 127 114 122 118 126 116 124 120 128
 177 185 181 189 179 187 183 191 178 186 182 190 180 188 184 192
 241 249 245 253 243 251 247 255 242 250 246 254 244 252 248 256
 305 313 309 317 307 315 311 319 306 314 310 318 308 316 312 320
 369 377 373 381 371 379 375 383 370 378 374 382 372 380 376 384

From (14) and (15),

$$ u = (r \oplus c) P^{-1}. \tag{16} $$

Substituting (16) into (12) and (13),

$$ (r \oplus c) P^{-1} E_k = 0, \tag{17} $$

$$ (r \oplus c) P^{-1} D_k [H]_{384/k} = 0, \tag{18} $$

where $k$ is equal to the number of times each code symbol appears in the frame.

Let $C(t_0) = (c(t_0), c(t_0 - 1), \ldots, c(t_0 - 41))$ denote the state of the LFSR that
the eavesdropper would like to solve for. By (5), $c = (c(t_1), c(t_2), \ldots, c(t_{384}))$ can
be represented as

$$ c = C(t_0) \hat{A}, \tag{19} $$

where $\hat{A} = (A^{t_1 - t_0 - 1} a, A^{t_2 - t_0 - 1} a, \ldots, A^{t_{384} - t_0 - 1} a)$.

Substituting (19) into (17) and (18), we have the following equations with
$c(t_0), c(t_0 - 1), \ldots, c(t_0 - 41)$ as variables:

$$ (r \oplus C(t_0) \hat{A}) P^{-1} E_k = 0, \tag{20} $$

$$ (r \oplus C(t_0) \hat{A}) P^{-1} D_k [H]_{384/k} = 0. \tag{21} $$

Corresponding to every $k$, $k = 1, 2, 4, 8$, there are 200 or more linear equations
involved in (20) and (21). Solving these linear equations, the state of the LFSR
can be recovered.

Now, taking the power control bits into account, many linear equations
in (20) and (21) will no longer hold because of the corruption caused by the power
control bits. The power control subchannel transmits at a rate of one bit every
1.25 ms. Using the puncturing technique, each power control bit, which is two
symbols long, replaces two consecutive downlink traffic channel code symbols.
Since the code symbols have a rate of 19.2 ksps, there will be one power control
bit transmitted within every 24 code symbols. There are 16 possible starting
positions for the power control bit. Each position corresponds to one of the first
16 code symbols within a 1.25 ms period. In each 1.25 ms period, a total of
24 bits from the long code sequence are used for scrambling. These bits are
numbered 0 through 23. The 4-bit binary number with values 0 through 15,
formed from scrambling bits 23, 22, 21, and 20, is used to determine the position
of the power control bit. Hence, within every 24 code symbols, the last 7 code
symbols are not affected by the power control bits. These uncorrupted code
symbols include $r_{18+24i}, r_{19+24i}, \ldots, r_{24+24i}$, $i = 0, 1, \ldots, 15$. Table 2 outlines the
interleaver output uncorrupted by the power control bits. Those symbols in the
dotted boxes in Table 2 can be recovered reliably from $r$ and $c$. Correspondingly,
those symbols in the dotted boxes in Table 1 can be recovered reliably from $r$ and
$c$. Code symbols in the dotted boxes in Table 1 can be divided into 7 groups.
Each group contains 16 consecutive code symbols. The 7 groups are listed as
follows:

49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64

113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128

177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192

241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256

305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320

337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352

369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384
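
The seven groups above can be reproduced from the puncturing rule and the interleaver. The following sketch rests on two assumptions: the interleaver formula is a pattern read off Table 2 rather than a formula given in the paper, and each block of 24 consecutive output symbols is taken to be one 1.25 ms power control period. All function names are illustrative.

```python
def bitrev(x, width):
    """Reverse the `width` low-order bits of x."""
    r = 0
    for _ in range(width):
        r = (r << 1) | (x & 1)
        x >>= 1
    return r


def output_order():
    """1-based interleaver input indices in output order (pattern of Table 2)."""
    return [1 + 64 * (row % 6) + 16 * bitrev(row // 6, 2) + bitrev(col, 4)
            for col in range(16) for row in range(24)]


def safe_positions():
    """0-based positions within a 24-symbol period that puncturing never hits."""
    hit = {start + d for start in range(16) for d in (0, 1)}  # two symbols long
    return [p for p in range(24) if p not in hit]


def uncorrupted_groups():
    """Runs of consecutive input symbols that always survive puncturing."""
    order = output_order()
    safe = set(safe_positions())
    symbols = sorted(order[p] for p in range(384) if p % 24 in safe)
    groups = []
    for sym in symbols:
        if groups and sym == groups[-1][1] + 1:
            groups[-1][1] = sym          # extend the current run
        else:
            groups.append([sym, sym])    # start a new run
    return [tuple(g) for g in groups]


print(safe_positions())
print(uncorrupted_groups())
```

The safe positions are the last 7 of every 24, and mapping them back through the interleaver yields seven runs of 16 consecutive input symbols.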

In the following, we extract those linear equations in (20) and (21) that
involve only the uncorrupted code symbols in $r$; such linear equations are
called reliable linear equations.

(1) When the data rate is 19.2 ksps, we cannot obtain any linear equations
from (20) since code symbols are not repeated. In (21), there are 8 reliable
linear equations. These reliable linear equations are related to code symbols
in the 7th group, that is, $u_{369}$ through $u_{384}$. Since the constraint length of the
(2, 1, 8) convolutional encoder is 9, except for the first and the last 8 linear
equations, the linear equations in (21) are related to 18 code symbols. From
the structure of $H$, it is easy to see that the first 8 linear equations are related
to the first 2, 4, 6, 8, 10, 12, 14, and 16 code symbols, respectively. Similarly, the
last 8 linear equations are related to the last 2, 4, 6, 8, 10, 12, 14, and 16 code
symbols, respectively.

(2) When the data rate is 9.6 ksps, there are 56 reliable linear equations in (20),
which are related to the code symbols in the 7 groups. There are 4 reliable linear
equations in (21), which are related to the code symbols in the 7th group. Hence,
there are a total of 60 reliable linear equations in (20) and (21).

(3) When the data rate is 4.8 ksps, there are 84 reliable linear equations in (20)
and 2 reliable linear equations in (21), for a total of 86 reliable linear equations
in (20) and (21).

(4) When the data rate is 2.4 ksps, there are 98 reliable linear equations in (20)
and 1 reliable linear equation in (21), for a total of 99 reliable linear equations
in (20) and (21).
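
The counts in cases (1) to (4) follow from a short calculation. The sketch below assumes that each of the 7 groups of 16 consecutive symbols is aligned with the repetition blocks, so a group contributes $16 - 16/k$ equality checks, and that the reliable parity checks are those among the last 8 equations of (21) that touch only the last $16/k$ sampled symbols. The function name `reliable_counts` is illustrative.

```python
GROUPS = 7        # groups of uncorrupted code symbols
GROUP_SIZE = 16   # consecutive code symbols per group


def reliable_counts(k):
    """Return (equations from (20), equations from (21)) when each symbol appears k times."""
    blocks = GROUP_SIZE // k                           # distinct symbols per group
    from_repetition = GROUPS * (GROUP_SIZE - blocks)   # k - 1 equalities per block
    from_parity = 8 // k                               # checks confined to the 7th group
    return from_repetition, from_parity


for k, rate in ((1, "19.2"), (2, "9.6"), (4, "4.8"), (8, "2.4")):
    rep, par = reliable_counts(k)
    print(rate, "ksps:", rep, "+", par, "=", rep + par)
```

Under these assumptions the totals are 8, 60, 86, and 99 equations for the four rates.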

Hence, when the data rate is not 19.2 ksps, an eavesdropper can derive 60
or more reliable linear equations from one data frame. The eavesdropper can
solve the reliable linear equations corresponding to the three data rates, 9.6
ksps, 4.8 ksps, and 2.4 ksps. If there is no solution for any of
the three data rates, the eavesdropper concludes that the data rate is
19.2 ksps. If there is a solution for one of the three data rates, the
long code sequence can be computed and the positions of the power control bits
can be determined. The corruption caused by the power control bits can then
be removed by the convolutional code. The intercepted code symbols are tested
against equations (17) and (18). If the test fails, the solution is incorrect. If
only one solution passes the test, then the long code sequence has been found.
If more than one solution passes the test, another frame is used to test every
solution; this process continues until a unique solution is found.
3.4 A Robust Attack

In the above attack, the eavesdropper needs to solve 4 sets of linear equations
corresponding to the 4 data rates, 19.2 ksps, 9.6 ksps, 4.8 ksps, and 2.4 ksps.
When more than one set of linear equations has solutions, the
eavesdropper has to determine which solution is correct. This process can be
avoided if the eavesdropper can construct a set of linear equations that hold for
all 4 data rates. In the following, we investigate reliable linear equations in
(20) and (21) that hold for the 4 data rates.

When the data rate is not 19.2 ksps, the code symbols in $u$ are repeated at
least one time. As a result, $u_{2i-1}$ and $u_{2i}$, $1 \le i \le 192$, must be equal, i.e.,

$$ u_{2i-1} \oplus u_{2i} = 0, \quad 1 \le i \le 192. \tag{22} $$

When the data rate is 19.2 ksps, the code symbols in $u$ are not repeated. But,
from (13), we can get the following equation:

$$ u_{383} \oplus u_{384} = 0. \tag{23} $$

By (22), it can be concluded that (23) holds for the 4 data rates. From Table 2,

$$ u_{383} = s_{192}, $$

$$ u_{384} = s_{384}. $$

Since $r_{192}$ and $r_{384}$ are not corrupted by the power control bits, by (15),

$$ u_{383} = r_{192} \oplus c(t_{192}), $$

$$ u_{384} = r_{384} \oplus c(t_{384}). $$

Hence, it follows that

$$ c(t_{192}) \oplus c(t_{384}) = r_{192} \oplus r_{384}. \tag{24} $$

Therefore, from every intercepted frame, the eavesdropper can always construct
a reliable linear equation which is independent of the data rate associated with
the frame. With 42 intercepted frames, the eavesdropper can construct a set of
42 reliable linear equations to solve for the long code sequence. Each frame lasts
for 20 ms, so 42 frames last for 840 ms, which is less than 1 second. Hence, by
eavesdropping on the downlink traffic channel for 1 second, the eavesdropper can get
enough information to recover the long code sequence. After recovering the long
code sequence, the eavesdropper can despread the uplink traffic channel.
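
The robust attack can be simulated end to end on synthetic data. The sketch below is an illustration under stated assumptions, not a field-tested procedure: the feedback lags are assumed to be the exponents of the long code polynomial as commonly quoted for IS-95, the 384 scrambling bits of a frame are assumed to be consecutive long code bits (any decimation of the long code is ignored), and each frame is reduced to the two symbols used in (24). All names are illustrative.

```python
import random

TAPS = [1, 2, 3, 5, 6, 7, 10, 16, 17, 18, 19, 21, 22, 25, 26, 27, 31, 33, 35, 42]
N = 42        # register length
FRAME = 384   # scrambling bits per frame (assumed consecutive in this sketch)


def long_code(initial, length):
    """Extend the 42 initial bits c(0..41) with the recurrence (3)."""
    c = list(initial)
    for k in range(N, length):
        bit = 0
        for i in TAPS:
            bit ^= c[k - i]
        c.append(bit)
    return c


def functionals(length):
    """masks[k] is a 42-bit mask: c(k) is the XOR of the initial bits it selects."""
    masks = [1 << j for j in range(N)]
    for k in range(N, length):
        m = 0
        for i in TAPS:
            m ^= masks[k - i]
        masks.append(m)
    return masks


def solve_gf2(rows):
    """Gaussian elimination over GF(2); rows are (mask, rhs). Free variables get 0."""
    basis = {}                                   # highest set bit -> (mask, rhs)
    for mask, rhs in rows:
        while mask:
            h = mask.bit_length() - 1
            if h not in basis:
                basis[h] = (mask, rhs)
                break
            bm, br = basis[h]
            mask ^= bm
            rhs ^= br
    x = [0] * N
    for h in sorted(basis):                      # back-substitute, low pivots first
        mask, rhs = basis[h]
        for j in range(h):
            if (mask >> j) & 1:
                rhs ^= x[j]
        x[h] = rhs
    return len(basis), x


def robust_attack(num_frames=42, seed=7):
    """Simulate frames, build one equation (24) per frame, and solve for the state."""
    rng = random.Random(seed)
    state = [rng.randint(0, 1) for _ in range(N)]
    total = num_frames * FRAME
    c = long_code(state, total)
    masks = functionals(total)
    rows = []
    for f in range(num_frames):
        i192, i384 = f * FRAME + 191, f * FRAME + 383   # 0-based t_192 and t_384
        b = rng.randint(0, 1)                           # u_383 = u_384 = b, by (23)
        r192, r384 = b ^ c[i192], b ^ c[i384]           # intercepted symbols, by (15)
        rows.append((masks[i192] ^ masks[i384], r192 ^ r384))
    rank, guess = solve_gf2(rows)
    return state, rank, guess


state, rank, guess = robust_attack()
print(rank, guess == state)
```

If the 42 equations are linearly independent (rank 42), the recovered bits coincide with the true initial state and the whole long code sequence follows from (3). If the rank is lower, more frames would be needed.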

4 Concluding Remarks 

The analysis of this paper demonstrates that IS-95 CDMA provides a lower
than expected level of voice privacy. By eavesdropping on the transmission on the
downlink traffic channel for one second, an eavesdropper can recover the long
code sequence used for voice privacy. The vulnerability of the voice privacy may
affect the security of the authentication process, since the long code mask
is generated during the authentication process. As details of the authentication
protocol are not publicly available due to export restrictions, we leave this
problem for future research.

There are several reasons for the weak voice privacy. First, channel coding
and symbol repetition leak information about the long code sequence, which is
demonstrated by (12) and (13). Channel coding provides error correction capability
by introducing redundancy in the transmitted information, while ciphers
remove redundancy in the information. When error-correcting codes and ciphers
are used in the same channel, the effects of error-correcting codes on ciphers
should be carefully examined. Second, the block interleaver does not distribute
power control bits in a frame uniformly. From Table 1, there are many consecutive
code symbols which are not affected by the power control bits. Third, and
most importantly, the long code generator is not a good cipher. From (3), the
long code sequence is a linear feedback shift register sequence. A linear feedback
shift register sequence may be good for spread spectrum purposes but is not good
for cryptographic purposes. It would be better to design two separate sequences,
one used for spreading and the other for voice privacy. It is difficult to design
a single sequence for both spread spectrum and voice privacy, because the two
applications have different requirements. Hence, a new cipher that provides strong
voice privacy is required.

Last, we would like to emphasize that the attacks described in this paper have not
gone through any field test and we have no intention of performing any test in
the future. To achieve a high level of voice privacy, the Telecommunication Industry
Association (TIA) TR-45 has created a process for developing enhanced
encryption algorithms for the next generation CDMA systems. The enhancement
will use 128-bit private keys. Several algorithms, including an algorithm
designed by the authors of the paper, have been submitted to the TR-45 Ad Hoc
Authentication Group (AHAG) for adoption as the encryption standard for the next
generation CDMA systems.

Abstract. We present and analyze attacks on additive stream ciphers
that rely on linear equations that hold with non-trivial probability in
plaintexts that are encrypted using distinct keys. These attacks extend
Biham’s key collision attack and Hellman’s time memory tradeoff attack,
and can be applied to any additive stream cipher. We define linear redundancy
to characterize the vulnerability of a plaintext source to these
attacks.

We show that an additive stream cipher with an $n$-bit key has an effective
key size of $n - \min(l, \lg M)$ against the key collision attack, and of
$2n/3 + \lg(n/3) + \max(n - l, 0)$ against the time memory tradeoff attack, when
the attacker knows $l$ linear equations over the plaintext and has $M$
ciphertexts encrypted with $M$ distinct unknown secret keys.

Lastly, we analyze the IP, TCP, and UDP protocols and some typical 
protocol constructs, and show that they contain significant linear redun- 
dancy. We conclude with observations on the use of stream ciphers for 
Internet security. 



1 Introduction 

Biham’s key collision (KC) attack [5] and Hellman’s time-memory tradeoff
(TMTO) attack [11] can be adapted to attack additive encryption in the case 
that many ciphertexts encrypted with distinct keys, whose corresponding plain- 
texts all obey some known linear relations, are available to the cryptanalyst. 
Both of these methods use a precomputation stage in which some knowledge of 
the typical plaintext is used to build a database, followed by an attack stage in 
which (hopefully many) ciphertexts are analyzed in an attempt to find unknown 
keys. The computational cost of the precomputation stage can be amortized over 
many runs of the attack stage, significantly reducing the effective key size of the 
cipher against these attacks. These attacks rely on the fact that there are linear 
equations in the plaintext bits that are known to the cryptanalyst. We define
the linear redundancy of a plaintext source to capture this property.

A linearly redundant source may involve linear equations that hold with 
probabilities that are not close to unity. We present and analyze an adaptation 
of the KC attack that works in such cases by using error correcting codes. 



D.R. Stinson and S. Tavares (Eds.): SAC 2000, LNCS 2012, pp. 14-28, 2001.

We analyze the linear redundancy of the Internet Protocol (IP), the Transmission
Control Protocol (TCP), and the User Datagram Protocol (UDP) traffic with
stream ciphers. IP is used by the Internet to transport packets between networks
[1], while TCP and UDP are the most common higher-level protocols transported
by IP. IP, TCP, and UDP packets are known to contain a significant amount 
of data that is guessable by an adversary (as was pointed out by Bellovin [4]). 
Our analysis extends these observations by showing that these packets contain 
a large amount of linear redundancy that can be used in cryptanalytic attacks. 

The Stream Cipher ESP (SC-ESP) is a specification for the use of those
ciphers to provide privacy within the IPSEC framework [13,9]. It describes how
to use additive stream ciphers for the encryption of IP packets (if used in tunnel
mode) as well as TCP, UDP, or other packets (if used in transport mode; see footnote 1). Below, we
derive requirements on SC-ESP that provide protection against the attacks that 
we develop in this paper. We do not investigate the linear redundancy of other 
important Internet protocols, such as HTTP or RTP, though such protocols are 
commonly used with additive encryption in the SSL, TLS, and SSH protocols. 
However, the techniques that we develop in this paper do apply to their analysis, 
and we expect that these protocols also contain a significant amount of linear 
redundancy. 

The rest of this paper is organized as follows. Section 1.1 introduces our 
terminology and assumptions. Section 2 introduces the idea of linear redun- 
dancy. Section 3 introduces the key collision attack, shows how it can be applied 
to attack additive encryption, and analyzes its computational cost and success 
probability, while Section 3.1 shows how that attack can be modified to deal 
with linear equations that are probabilistic, rather than deterministic. Section 4 
adapts Hellman’s time-memory tradeoff to attack additive encryption, and analyzes
the resulting algorithm. The IP, TCP, and UDP protocols are analyzed in
Section 5, and are shown to contain enough linear redundancy to enable the suc- 
cessful prosecution of the attacks that we derive. Our conclusions are presented 
in Section 6. 

1.1 Terminology and Assumptions 

An additive stream cipher is a cipher that encrypts a plaintext by bitwise adding 
it (modulo two) to a keystream. The keystream is generated pseudorandomly, 
given a secret key. Mathematically, 

$c_i = p_i \oplus s_i(k),$ (1)

where $c_i$, $p_i$, and $s_i(k)$ are the $i$th bits of the ciphertext, the plaintext, and the
keystream corresponding to the key $k$. Additive stream ciphers can be defined over
any group, and our results can easily be generalized, but below we consider only 
binary addition for clarity of exposition. 

Modern stream ciphers include RC4, SEAL, the Output Feedback (OFB)
mode specified by NIST for use with the DES [18] and the counter mode for
block ciphers [16, p. 100]. RC4 is widely used to secure HTTP, as it is part of the
Secure Sockets Layer (SSL) and Transport Layer Security (TLS) specifications.
Other stream ciphers in use include the recently broken A5/1 [2] used in GSM
cellular phones, and the cipher E0 in the Bluetooth specification for Wireless
LAN Security [3].

Footnote 1: See [8,9] for a more detailed description of IPSEC and ESP

We make the conventional assumption that the cryptanalyst can check if a 
key is correct by trial decryption of a ciphertext followed by a redundancy check 
of the decrypted plaintext. We also assume that ciphertexts are distributed uni- 
formly at random, which is essentially equivalent to assuming that the cipher is 
indistinguishable from a truly random source. We also assume that the cryptan- 
alyst has access to many ciphertexts encrypted under many distinct keys, whose 
corresponding plaintexts originate from a random but redundant source whose 
mathematical characterization is known to the attacker. We make the implicit 
assumption that the unknown keys are distinct, which is a good assumption when 
the number of unknown keys is less than the square root of the total number of 
keys, from the ‘birthday paradox’. 

2 Linear Redundancy 

We generalize the idea of known or guessable plaintext attacks by considering 
attacks on a large number of ciphertexts whose plaintexts were all generated by 
the same source. We use the information theoretic idea of a plaintext source as 
a generator of binary strings that chooses strings by a random process that can 
be characterized by a probability distribution. A source is redundant when its 
probability distribution is not uniform. 

To attack an additive cipher, we consider linear equations in terms of the 
plaintext bits. From Equation 1, it follows that 



$c_i \oplus c_j = (p_i \oplus p_j) \oplus (s_i(k) \oplus s_j(k)).$ (2)

If $p_i \oplus p_j$ is zero (respectively, one), then $c_i \oplus c_j$ will equal $s_i(k) \oplus s_j(k)$
(respectively, will be its opposite). If the same property holds for a large number
of plaintext bits, those bits can be used to identify a collision between a secret 
key set and a known key set. A single linear relation among the plaintext bits of 
all plaintexts from a source is equivalent to a single bit of known plaintext, for 
our purposes. 
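The relation in Equation 2 can be checked numerically. The following is a minimal sketch, assuming a hypothetical toy keystream generator built from SHA-256 (it is not one of the ciphers discussed in this paper); the names `keystream` and `encrypt` are illustrative only.

```python
import hashlib

def keystream(key: bytes, nbits: int) -> list:
    """Hypothetical toy keystream: the leading bits of SHA-256(key)."""
    digest = hashlib.sha256(key).digest()
    bits = [(byte >> (7 - b)) & 1 for byte in digest for b in range(8)]
    return bits[:nbits]

def encrypt(key: bytes, plaintext_bits: list) -> list:
    """Additive encryption as in Equation 1: c_i = p_i XOR s_i(k)."""
    s = keystream(key, len(plaintext_bits))
    return [p ^ k for p, k in zip(plaintext_bits, s)]

key = b"secret key"
# A plaintext whose bits 0 and 5 are equal, so p_0 XOR p_5 = 0.
p = [1, 0, 0, 1, 1, 1, 0, 1]
c = encrypt(key, p)
s = keystream(key, len(p))

i, j = 0, 5
lhs = c[i] ^ c[j]                       # computable by the attacker
rhs = (p[i] ^ p[j]) ^ (s[i] ^ s[j])     # right-hand side of Equation 2
keystream_relation = s[i] ^ s[j]        # depends only on the key
```

Because the two plaintext bits are equal, `lhs` reveals one bit of information about the keystream, which is what a single linear relation contributes to the attacks.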

If there are I linear relationships between the plaintext bits, this fact can be 
represented mathematically as 

$\sum_{i=1}^{w} L_{ij} p_i = e_j$ for all $j : 1 \le j \le l$, (3)

where $L$ is an invertible $m \times w$ boolean matrix, and $e$ is an $l \times 1$ boolean vector.

More generally, Equation 3 can hold with some probability not equal to one.
The vector $\delta = Lp \oplus e$, which would be the zero vector if the linear equations
held with probability one, has a low Hamming weight. We define the set $\Delta$ as
the set of typical (e.g., most probable) values of $\delta$, such that

$\sum_{\delta \in \Delta} P(\delta) \ge 1 - \epsilon,$ (4)

where $P(\delta)$ is the probability that $Lp \oplus e$ will have the value $\delta$, and $\epsilon$ is a number
less than one. We say that a plaintext source has linear redundancy $(\lambda, \epsilon)$ if there
exist an $L$ and $e$ as defined above such that $\lambda = 1 - \lg(\#\Delta)/l$.

In the case that each of the linear equations holds with the same probability
$\phi$, the expected weight of $\delta = Lp \oplus e$ is $\phi l$. In this case, the size of the set
of typical vectors $\Delta$ is well approximated by $\sum_{i=0}^{\phi l} \binom{l}{i} \approx 2^{l h(\phi)}$, where
$h(\phi) = -\phi \lg \phi - (1 - \phi) \lg(1 - \phi)$ is the binary entropy function (see footnote 2).
The linear redundancy then reduces to $(1 - h(\phi), 1/2)$. In the following, we focus on the practical attacks
rather than the theoretical characterization of linear redundancy.
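The size of the typical set can be compared with the entropy estimate directly. The sketch below is illustrative only: the parameter values $l = 128$ and $\phi = 0.1$ are assumptions chosen for the example, and the function names are hypothetical. It relies on the standard bound that the partial binomial sum does not exceed $2^{l h(\phi)}$ when $\phi \le 1/2$; since the estimate is asymptotic, the two quantities need not be close for small $l$.

```python
import math

def binary_entropy(phi: float) -> float:
    """h(phi) = -phi lg phi - (1 - phi) lg(1 - phi), with h(0) = h(1) = 0."""
    if phi in (0.0, 1.0):
        return 0.0
    return -phi * math.log2(phi) - (1 - phi) * math.log2(1 - phi)

def lg_typical_set_size(l: int, phi: float) -> float:
    """lg of sum_{i=0}^{floor(phi*l)} C(l, i), computed with exact integers."""
    total = sum(math.comb(l, i) for i in range(int(phi * l) + 1))
    return math.log2(total)

l, phi = 128, 0.1                          # assumed example values
exact = lg_typical_set_size(l, phi)        # lg(#Delta)
estimate = l * binary_entropy(phi)         # l * h(phi)
redundancy = 1 - exact / l                 # lambda = 1 - lg(#Delta)/l
```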

Our attacks can be viewed as decoding unknown keys, and thus matching 
them to some set of known keys. From this viewpoint, there is a noisy com- 
munication channel from an unknown key to the attacker, where the ‘noise’ is 
a plaintext message. The unknown keys are the source words, the keystream 
segments are the code words, ciphertexts are the received words. The attacker 
faces the problem of decoding the received words to a known code word. We call 
this channel the cryptanalytic channel, and it is analogous to the one defined by 
Siegenthaler in the description of correlation attacks on combination generators 
[14]. The code used in our attacks is a set of keys that is randomly chosen by 
the attacker. Obviously, ciphertexts created with unknown keys that are not in 
the code cannot be properly decoded. Our attacks work by decoding correctly 
whenever possible, and rely on the ‘birthday paradox’ to ensure that there are 
keys common to both the random code and the set of unknown keys. 

In some cases, attacks using linear redundancy can be significantly improved 
through the use of traffic analysis, that is, the use of external information about 
the ciphertexts to establish the value of the vector e. In the case of Internet 
security, this information includes the length of the encrypted data, the time 
of creation of the encrypted data, and the position of each ciphertext in the 
sequence of all ciphertexts. 

3 Key Collision Attacks on Additive Encryption 

Key collision attacks [5] take advantage of the birthday paradox to reduce the 
expected work effort of finding secret keys. These attacks use two distinct sets 
of keys: a set of unknown secret keys, and a set of keys generated by the crypt- 
analyst. These sets will contain a common element with high probability when 
the product of the sizes of the sets is close to the size of the set of all keys. 

The known-plaintext key collision attack works as follows: the cryptanalyst
encrypts the same fixed plaintext with $N$ distinct keys, and stores the resulting
ciphertexts along with the keys that generated them. We call the set of ciphertexts
the known key set. The cryptanalyst then obtains $M$ ciphertexts that
are created by encrypting the same plaintext with distinct unknown keys, and
looks for collisions, that is, elements with the same keys that are in both sets.
When one of the unknown keys is equal to one of the known keys, a collision occurs.
With an $n$-bit key, this will happen with high probability when $MN > 2^n$.
In practice, many keys may map the same plaintext to the same ciphertext, so
the cryptanalyst must check each collision with a trial decryption.

Footnote 2: This approximation uses the tail inequality [6], and is asymptotically exact
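To give a feel for the condition on $MN$, the following sketch computes the probability that a set of $N$ distinct known keys contains at least one of $M$ distinct unknown keys, and compares it with the usual birthday-style estimate $1 - e^{-MN/2^n}$. The 16-bit key size and the function names are assumptions made only to keep the example small.

```python
import math

def collision_probability(n_bits: int, M: int, N: int) -> float:
    """Exact probability that N distinct known keys, drawn uniformly without
    replacement, include at least one of M distinct unknown keys."""
    K = 1 << n_bits
    log_miss = 0.0
    for i in range(N):
        # The (i+1)-th known key avoids all M unknown keys.
        log_miss += math.log1p(-M / (K - i))
    return 1.0 - math.exp(log_miss)

def collision_estimate(n_bits: int, M: int, N: int) -> float:
    """Birthday-style estimate 1 - exp(-MN / 2^n)."""
    return 1.0 - math.exp(-M * N / 2 ** n_bits)

exact_equal = collision_probability(16, 256, 256)    # MN = 2^n
approx_equal = collision_estimate(16, 256, 256)
exact_larger = collision_probability(16, 512, 512)   # MN = 4 * 2^n
```

With $MN = 2^n$ a collision is already more likely than not, and a modest increase in either set pushes the probability close to one.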

In order to attack additive encryption of linearly redundant plaintext, we 
define a hallmark of a key. This is a binary vector that captures enough infor- 
mation about the key to enable elements of the known key set to be matched to 
the unknown key set. 

Combining Equations 3 and 1 gives

$\sum_{i=1}^{w} L_{ij} s_i(k) = e_j \oplus \sum_{i=1}^{w} L_{ij} c_i$ for all $j : 1 \le j \le l$. (5)

The known key hallmark $v$ is defined by $v_j(k) = \sum_{i=1}^{w} L_{ij} s_i(k)$. The unknown
key hallmark $u$ is defined by $u_j = e_j \oplus \sum_{i=1}^{w} L_{ij} c_i$. Both $v$ and $u$ are
length $l$ binary vectors. In the event that $u = v$, it is (at least relatively) likely
that the known key and unknown key are equal.

We now show how to prosecute a KC attack on an additive cipher, given $L$
and $e$. In the precomputation stage, compute the set $V = \{(v(k), k) : k \in R\}$
of known keys and their hallmarks, where $R$ is a set of $N$ arbitrary distinct
keys, and sort the vectors so that their first components are in non-decreasing
order. In the attack stage, we are given the set $C = \{c\}$ of ciphertexts, and we
want to find as many of the unknown keys as possible. We denote the number
of ciphertexts (and thus the number of unknown hallmarks) as $M$. The attack
algorithm follows:

1. Compute the set of unknown keys and hallmarks $U = \{(Lc \oplus e, c) : c \in C\}$,
and sort it into non-decreasing order.

2. Find the join $J = \{(x, k, c) : (x, c) \in U, (x, k) \in V\}$, that is, the intersection
of the first components of $V$ and $U$.

3. For each element $(x, k, c) \in J$, do a trial decryption of the ciphertext $c$ using
the key $k$.
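The three steps can be illustrated on a toy scale. The sketch below is not the authors' implementation: it assumes a hypothetical 16-bit-key keystream generator based on SHA-256, takes the simplest possible $L$ (the first 16 plaintext bits are known, so $e$ is the known header), and uses a dictionary lookup in place of the sort-and-join described above. All names are illustrative.

```python
import hashlib

HEADER = b"\x45\x00"   # assumed-known plaintext prefix: 16 known bits, so l = 16

def keystream(key: int, length: int) -> bytes:
    """Hypothetical toy keystream generator with a 16-bit key (n = 16)."""
    return hashlib.sha256(key.to_bytes(2, "big")).digest()[:length]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def crypt(key: int, data: bytes) -> bytes:
    """Additive encryption; decryption is the same operation."""
    return xor(data, keystream(key, len(data)))

def body_checksum(body: bytes) -> int:
    check = 0
    for b in body:
        check ^= b
    return check

def make_packet(body: bytes) -> bytes:
    """Known header, free body, one-byte XOR checksum of the body."""
    return HEADER + body + bytes([body_checksum(body)])

def looks_valid(pt: bytes) -> bool:
    """Redundancy check applied after a trial decryption."""
    body = pt[len(HEADER):-1]
    return pt[:len(HEADER)] == HEADER and pt[-1] == body_checksum(body)

def precompute(known_keys):
    """Known key set V, indexed by the hallmark v(k) = L s(k)."""
    table = {}
    for k in known_keys:
        table.setdefault(keystream(k, len(HEADER)), []).append(k)
    return table

def attack(table, ciphertexts):
    """Form u = Lc xor e, match it against V, then trial-decrypt each hit."""
    found = {}
    for idx, c in enumerate(ciphertexts):
        u = xor(c[:len(HEADER)], HEADER)
        for k in table.get(u, []):
            if looks_valid(crypt(k, c)):
                found.setdefault(idx, []).append(k)
    return found

table = precompute(range(4096))               # N = 2^12 known keys
secret_keys = [100, 5000, 4095, 60000]        # two of these lie in the known set
ciphertexts = [crypt(k, make_packet(bytes([i, i + 1, i + 2])))
               for i, k in enumerate(secret_keys)]
found = attack(table, ciphertexts)
```

Only the ciphertexts whose keys happen to lie in the known key set are recovered; the remaining ciphertexts can at most produce false hits, which the trial decryption is meant to filter out.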

The intersection of two sets of bit vectors can be found by sorting each 
set into non-decreasing order, maintaining a pointer into each set, repeatedly 
advancing the pointer that points to the smallest element, and outputting the 
elements when they match [10]. If radix sort [10] is used, then this algorithm is
completely parallelizable.
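The two-pointer intersection described above can be sketched as follows. This is a minimal illustration on integers standing in for bit vectors; it uses Python's built-in sort rather than radix sort, and the function name is hypothetical.

```python
def sorted_intersection(a, b):
    """Intersect two collections by sorting them and advancing two pointers."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    matches = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            matches.append(a[i])     # output the element when both pointers agree
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1                   # advance the pointer at the smaller element
        else:
            j += 1
    return matches

common = sorted_intersection([5, 1, 9, 3], [3, 4, 5, 6])
```

After sorting, the merge itself takes at most len(a) + len(b) comparisons, which corresponds to the $M + N$ term in the cost analysis.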

An important property of this attack is that the vector e need not be known 
during the precomputation stage; it is sufficient to know e during the attack 
stage. This property lends itself to practical attacks, as there are many cases in 
which the plaintext at two locations will be linearly related, though the exact
value of the relationship may only be predictable through traffic analysis or
other external means. For example, traffic analysis of the IP protocol can readily 
discern many TCP/IP packets (TCP ACK packets have the distinctive length 
of 43 bytes), thus revealing the value of the ‘Protocol’ field of the IP packet (See 
Table 1). 

The basic key collision attack requires storage of order $M + N$. The precomputation
stage requires $N$ encryptions and $N \lg N$ comparisons and copies (for
sorting).

The attack stage requires $M \lg M$ comparisons and copies in the sorting stage.
Finding the set $J$ requires $M + N$ comparisons. The attack performs $\#J$ trial
decryptions, which is equal to the sum of the number of false hits, which is
$MN/2^l$, and the number of true hits, which is $MN/2^n$. The total computation
is of order $M \lg M + M + N + MN(1/2^l + 1/2^n)$.

The expected number of true hits found, that is, the number of messages
successfully decrypted, is $MN/2^n$. Thus, the order of the expected work $w$ for
each successful decryption is given by

$w = \frac{M \lg M + M + N + MN(1/2^l + 1/2^n)}{MN/2^n} = \frac{2^n \lg M}{N} + \frac{2^n}{N} + \frac{2^n}{M} + 2^{n-l} + 1.$ (6)

If $2^n \lg M / N$ is the leading term in Equation (6), we say that the attack is
sort limited. If $2^n/M$ is the leading term, we say that the attack is intersection
limited. This can happen when the size of the known key set is large and the
size of the unknown key set is small. If $2^{n-l}$ is the leading term, we say that
the attack is information limited. This case happens when there are few linear
equations in the plaintext. The term 1 can never be the leading term, as this
implies that the known and unknown key sets are larger than the set of all keys.
This term can be neglected in practice. The expected number of keys that are
tried for a given unknown key hallmark is $N/2^l$. When the attack is information
limited, this number is large (on average). When the attack is sort limited or
intersection limited, it is small (on average).

To make the advantage over exhaustive search explicit, we introduce the
effective key size, which we define to be the base-two logarithm of the order of
the expected work. The effective key size is denoted as $\eta$, and is given by

$\eta = \lg w = n + \lg\left(\frac{1 + \lg M}{N} + \frac{1}{M} + 2^{-l} + 2^{-n}\right).$ (7)
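The effective key size can be evaluated numerically by dividing the total work estimate by the expected number of true hits, as in the text. The sketch below takes $\lg M$ and $\lg N$ as inputs so that large set sizes do not overflow; the function name and the sample parameters are assumptions for illustration.

```python
import math

def effective_key_size(n: int, l: int, lg_M: float, lg_N: float) -> float:
    """eta = lg w, with w = (M lg M + M + N + MN(2^-l + 2^-n)) / (MN / 2^n),
    rewritten as n + lg((1 + lg M)/N + 1/M + 2^-l + 2^-n)."""
    terms = (
        (1 + lg_M) * 2.0 ** (-lg_N),   # sorting and 2^n/N contributions
        2.0 ** (-lg_M),                # intersection contribution 2^n/M
        2.0 ** (-l),                   # false hits, the 2^(n-l) term
        2.0 ** (-n),                   # true hits, the constant term 1
    )
    return n + math.log2(sum(terms))

# Few equations and large key sets: information limited, eta is close to n - l.
eta_information = effective_key_size(128, 32, 80, 80)

# Many equations and moderate key sets: sorting dominates.
eta_sort = effective_key_size(128, 128, 40, 40)
```

In the first case the result is essentially $n - l = 96$ bits; in the second it is close to $n - \lg N + \lg \lg M$.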



When the attack is information limited, then

$\eta \approx n + \lg 2^{-l} = n - l.$ (8)

This approximation is valid when $l \ll \lg \min(M, N)$. The linear relationship between
effective key size and $l$ is shown in Figure 1. When the attack is intersection
limited, then

$\eta \approx n + \lg\left(\frac{1}{M} + \frac{1}{N}\right) \approx n - \lg \min(M, N).$ (9)

When the attack is sort limited, then

$\eta \approx n + \lg\frac{\lg M}{N} = n - \lg N + \lg \lg M.$ (10)

Fig. 1. The effective key size as a function of the linear redundancy. In this figure, the
fractional key size $\eta/n$ is plotted versus the fractional number of linear equations $l/n$.
The plot shows various values of $M/2^n$; in every case, N = 2"^^

We define the break even value $N_b$ to be the value of $N$ such that the effective
key size is equal to the actual key size; when $N$ is below this value, the attack
is not effective. Solving for this value, we find that
$N_b = (1 + \lg M)/(1 - 2^{-l} - 2^{-n} - 1/M)$. The effective key size as a function of $N$
can be succinctly expressed as

$\eta = n + \lg\left[1 + (1 + \lg M)\left(\frac{1}{N} - \frac{1}{N_b}\right)\right].$ (11)

The dependence of $\eta$ on $N$ is demonstrated in Figure 2. The maximum possible
value for $N_b$ is $2^n$, which implies that a necessary condition for the above attack
to provide an advantage over exhaustive search is that

$l > -\lg\left(1 - \frac{1}{M} - 2^{-n} - \frac{1 + \lg M}{2^n}\right).$ (12)

Fig. 2. The effective key size as a function of the size of the known key set. In this
figure, the fractional key size $\eta/n$ is plotted versus the fractional size of the known key
set $N/2^n$, with one curve for each of $M = 2^{n/4}$, $2^{n/2}$, $2^{3n/4}$, and $2^{7n/8}$. In these plots, $m = n$



3.1 Probabilistic Linear Equations 

The KC attack on additive encryption in the previous section can work even 
when the linear equations (5) do not always hold, but hold with some non- 
negligible probability. If the probability that all of the equations hold simultaneously
is $p$, then the effective key size is increased by $\lg(1/p)$. However, better
attacks can be realized by using error-correcting codes. Below we present a sim- 
ple adaptation of the KC attack that uses error correction of the hallmarks. 

The error-correcting KC attack differs from the KC attack presented above in
the precomputation stage and in Step 1. The known key hallmarks are required
to be codewords of an error-correcting code $D$ which has codewords of length $l$,
a total of $2^k$ codewords, and which can correct up to $e$ errors. This property can
be realized by using a rejection method during the precomputation stage, which
will increase the amount of computation in that stage by a factor of about $2^{l-k}$.

Step 1 of the attack algorithm is modified by changing the definition of the
set $U$ to $U = \{(d(Lc \oplus e), c) : c \in C\}$, where $d$ is a decoding function for the
code $D$.

The effective key size can be derived as with the information limited case
above, with the differences that in the error correcting case the number of false
hits is now $MN/2^k$ and the number of true hits is $pMN/2^n$ (see footnote 3). The effective key
size $\eta_{EC}$ for the error-correcting KC attack is

$\eta_{EC} = n - k + \lg\frac{1}{p} = n - lR + \lg\frac{1}{p},$ (13)

where R = k/l is the rate of the code. There is a tension between R and the 
decoding error, in that increasing one tends to decrease the other. It is difficult to 
further characterize the effectiveness of this attack in the general case because 
of the variety and complexity of error correcting codes [12]. One example of 
a useful code is the n = 128, k = 100, e = 4 code based on BCH codes [15].
Gallager codes [7], which have proved useful in correlation attacks on stream 
ciphers, may also prove useful in our attacks. It is also possible to use nonlinear 
codes, though such codes could require a significant storage space. 
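As a rough worked example of Equation (13), the sketch below evaluates $\eta_{EC}$ for a code of length 128 with $k = 100$ and $e = 4$. It rests on two assumptions that are not made in the text: that each equation holds independently with the same probability $\phi$, and that $p$ can be taken as the probability that at most $e$ equations fail. The function names and the key size of 128 bits are illustrative.

```python
import math

def prob_decodable(l: int, e: int, phi: float) -> float:
    """Assumed model: each of the l equations holds independently with
    probability phi; decoding succeeds when at most e of them fail."""
    return sum(math.comb(l, i) * (1 - phi) ** i * phi ** (l - i)
               for i in range(e + 1))

def effective_key_size_ec(n: int, k: int, p: float) -> float:
    """Equation (13): eta_EC = n - k + lg(1/p)."""
    return n - k + math.log2(1 / p)

p_exact = prob_decodable(128, 4, 1.0)      # deterministic equations
p_noisy = prob_decodable(128, 4, 0.99)     # each equation fails 1% of the time
eta_exact = effective_key_size_ec(128, 100, p_exact)
eta_noisy = effective_key_size_ec(128, 100, p_noisy)
```

Under this model the cost of tolerating noise is the small additive term $\lg(1/p)$, while the rate of the code determines how much of the redundancy can be used.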

A strict lower bound on the effective key size of the error correcting KC 
attack is provided by an information theoretic treatment of the cryptanalytic 
channel. The capacity C of that channel, which is determined by the plaintext 
source [6], is the upper bound on the rate $R$ of a code that can be used in the
attack, thus limiting that value in Equation (13). 

4 Hellman’s Time-Memory Tradeoff

Hellman’s time-memory tradeoff (TMTO) is a method that can be used to dramatically
reduce the average amount of computation needed to invert a one-way
function [11]. It works by precomputing a large table, then using the same table
to attack many secret keys. Asymptotically, this attack can be used to break a
block cipher with an $n$-bit key with about $2^{2n/3}$ operations, using $2^{2n/3}$ storage
[11]. Below, we review how to invert a function $S : F_2^n \to F_2^l$, then show how to
adapt this result to attack additive encryption of linearly redundant plaintext.

To perform the TMTO, given the function $S$ to be inverted, select a reduction
function $R : F_2^l \to F_2^n$, the size of the table $N$ (see footnote 4), and the tradeoff parameter $t$.
The reduction function serves to map the range of $S$ back onto its domain. In
the precomputation stage, define the function $f(x) = R(S(x))$, and compute the
set $T = \{(f^t(x), x) : x \in \mathcal{R}\}$, where $\mathcal{R}$ is a random $N$-element subset of $F_2^n$, and
sort the elements of $T$ so that their first components are in increasing order.

In the attack stage, to find $z$ given $y$ such that $S(z) = y$, compute the set
$Y = \{(f^i(R(y)), i) : i = 0, 1, \ldots, t - 1\}$, and sort its elements so that their first
components are in increasing order. For each component $(a, i) \in Y$ such that
$(a, x) \in T$ for some $x$, compute $f^{t-i-1}(x)$ and check if it is the proper inverse.
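The precomputation and attack stages can be sketched on a toy 16-bit function. This is an illustration under stated assumptions rather than the construction analyzed in the paper: $S$ is a hypothetical one-way function built from truncated SHA-256, the reduction function is a fixed XOR mask, a single table is used, and a dictionary lookup replaces the sorting step. All names and parameters are illustrative.

```python
import hashlib

def S(x: int) -> int:
    """Hypothetical one-way function on 16-bit values: truncated SHA-256."""
    d = hashlib.sha256(x.to_bytes(2, "big")).digest()
    return int.from_bytes(d[:2], "big")

def R(y: int) -> int:
    """Reduction function; range and domain coincide here, so a fixed mask suffices."""
    return y ^ 0x5A5A

def f(x: int) -> int:
    return R(S(x))

def iterate(x: int, times: int) -> int:
    """Apply f the given number of times."""
    for _ in range(times):
        x = f(x)
    return x

def build_table(starts, t):
    """T = {(f^t(x), x)}: map each chain end point to its start point(s)."""
    table = {}
    for x in starts:
        table.setdefault(iterate(x, t), []).append(x)
    return table

def invert(table, t, y):
    """Search for some z with S(z) = y using at most t table lookups."""
    a = R(y)                                 # a = f^0(R(y))
    for i in range(t):
        for x in table.get(a, []):
            z = iterate(x, t - i - 1)        # candidate f^(t-i-1)(x)
            if S(z) == y:                    # reject false alarms
                return z
        a = f(a)                             # a = f^(i+1)(R(y))
    return None

t = 32
table = build_table(range(64), t)            # N = 64 chains of length t
z_true = iterate(3, 5)                       # a point covered by the chain from 3
y = S(z_true)
z = invert(table, t, y)
```

Only points that lie on a stored chain can be inverted, so the coverage of the table, discussed below, governs the success probability.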

The precomputation stage requires $Nt$ evaluations of the function $f$, as well
as $N \lg N$ operations for the sorting and storing $N$ elements. The inversion stage
requires $t$ evaluations of $f$, sorting and storing $t$ elements, and $N + t$ operations
to find the intersection of a $t$ element set and an $N$ element set.

Footnote 3: This analysis assumes that the decoding function is equally likely to choose any
codeword if the number of errors in the hallmark is greater than $e$, a property which
holds for linear codes

Footnote 4: Hellman refers to this parameter as $m$ in [11]. We use the notation $N$ to be consistent
with the terminology in the Key Collision section

In the TMTO attack against block ciphers, the work effort due to false hits
is negligible. $R$ can be chosen so that it does not collide, so that a collision of
$S$ implies a collision of the underlying function $f$. If $l < n$, then this work
effort is no longer negligible, as the function $f$ will have more collisions than
expected. The expected number of false hits per table look up is bounded by
$Nt(t + 1)/2^{l+1} \approx Nt^2/2^{l+1}$.

The success probability of the TMTO attack algorithm is determined by the
number $\sigma$ of elements in the known key set $V$, where $\sigma$ can be bounded by

$\sigma \ge \sum_{i=1}^{N} \sum_{j=1}^{t} \left(1 - \frac{it}{2^n}\right)^{j}.$ (14)

Using the choice of parameters $N = t = 2^{n/3}$ suggested in [11], $\sigma \approx 2^{2n/3}$.
Below, we assume these values.

To use Hellman’s time-memory tradeoff in an attack against linearly redundant
plaintext encrypted with a stream cipher, $f$ is defined as a mapping from
keys to known hallmarks:

$f(k) = Ls(k) \oplus e.$ (15)

The known key set of hallmarks is the ‘logical table’ comprised of the iterates of
$S$ used in computing $T$, and the number of distinct elements that it contains is
$\sigma$.

If the TMTO is done on a set of $M$ unknown hallmarks simultaneously, a set
$Y$ must be computed for each unknown hallmark, and the union of the unknown
key sets has cardinality $tM$. In addition, if $n > l$, the time taken to check false
hits must be accounted for. The expected work effort $w$ of the TMTO attack is
thus, for $n \le l$,

$w = tM \lg(tM) + tM + N + \frac{M\sigma}{2^n}$ (16)
$\;\; = O(tM \lg(tM) + N),$

and for $n > l$,

$w = tM \lg(tM) + tM + N + \frac{M\sigma}{2^n} + MNt^2/2^{l+1}$ (17)
$\;\; = O(tM \lg(tM) + N + MNt^2/2^l).$

The expected number of correct keys that this algorithm finds is $M\sigma/2^n \approx M 2^{-n/3}$.
Thus the effective key size $\eta_T$ of the TMTO attack is given by

$\eta_T = \lg \frac{tM \lg(tM) + N + MNt^2/2^l}{M 2^{-n/3}}$
$\;\; = 2n/3 + \lg\left(n/3 + \lg M + 1/M + 2^{n-l}\right)$ (18)
$\;\; \approx 2n/3 + \lg(n/3) + \max(n - l, 0).$

Fig. 3. The ‘phase space’ of attacks on additive encryption, showing which attack dominates
as a function of the parameters $\lg M$ and $l$. Here, the parameters are represented
fractionally in terms of the key size $n$



4.1 Comparison of the TMTO and Key Collision Algorithms 

The TMTO attack is more effective than the basic attack when $\eta_T < \eta$. By
comparing the estimates for $\eta$ and $\eta_T$ given above, we can see that the KC
attack is preferable when $\lg(M) > n/3$. The complete ‘phase space’ of attacks
on additive encryption is illustrated in Figure 3. 

However, the TMTO as described does not work for probabilistic linear equa- 
tions. In that case, the KC attack has the advantage. 
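The comparison can be made concrete with the two estimates stated in the abstract, $n - \min(l, \lg M)$ for the KC attack and $2n/3 + \lg(n/3) + \max(n - l, 0)$ for the TMTO attack. The sketch below simply evaluates both and reports the smaller; the function names and sample parameters are assumptions for illustration, and the estimates apply to deterministic equations only.

```python
import math

def eta_kc(n: int, l: int, lg_M: float) -> float:
    """Key collision estimate: n - min(l, lg M)."""
    return n - min(l, lg_M)

def eta_tmto(n: int, l: int) -> float:
    """TMTO estimate: 2n/3 + lg(n/3) + max(n - l, 0)."""
    return 2 * n / 3 + math.log2(n / 3) + max(n - l, 0)

def preferred_attack(n: int, l: int, lg_M: float) -> str:
    """Name of the attack with the smaller effective key size."""
    return "KC" if eta_kc(n, l, lg_M) < eta_tmto(n, l) else "TMTO"

many_ciphertexts = preferred_attack(128, 128, 64)   # lg M well above n/3
few_ciphertexts = preferred_attack(128, 128, 20)    # lg M well below n/3
few_equations = preferred_attack(128, 32, 64)       # l much smaller than n
```

When $l$ is much smaller than $n$, the $\max(n - l, 0)$ term penalizes the TMTO attack heavily, which is consistent with the regions sketched in Figure 3.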

5 Linear Redundancy in IP Packets 

We analyzed the IP, TCP, and UDP protocols, and estimated the linear redun- 
dancy in the headers of those protocols. A summary of our results is given in 
Table 1. In this section, all numerals indicate binary expressions. 

The Version field is (almost without exception) equal to 0100. The Header 
Length is nearly always equal to 0101, unless an IP option is used, in which case 
it is probably 0110. The Precedence/TOS (Type of Service) field is generally set 
to 00000000. The Protocol field is usually 00000110 (for TCP) or 00010001 (for
UDP). The ‘Time to Live’ field is usually 00010000 or less. The ‘Source IP’ and 
‘Destination IP’ fields from the IP header are the same in every packet between 
two particular hosts. Each pair of packets with the same source and destination
that can be identified by traffic analysis provides 64 linear equations. The ‘Source
Port’ and ‘Destination Port’, in the TCP and UDP protocols, provide a total of
32 linear equations in the same manner. The TCP ‘Data Offset’ field is usually
set to 0101.

Table 1. Linear redundancy in the headers of the IP, TCP, and UDP protocols. The
common values are described in Section 5. The ‘Single Packet’ column shows the redundancy
that is detectable in a single packet. The ‘Two Packet’ column shows the
redundancy that is present in two consecutive packets from the same source

Protocol  Field                Size    Single Packet      Two Packet
                               (bits)  Redundancy (bits)  Redundancy (bits)

IP        Version              4       4                  4
          Header Length        4       4                  4
          Precedence/TOS       8       8                  8
          Packet Length        16      4                  4
          Packet ID            16      0                  0
          DF bit               1       1                  1
          MF bit               1       0                  0
          Fragment Offset      13      0                  0
          Time to Live         8       3                  3
          Protocol             8       7                  7
          Checksum             16      1                  1
          Source Address       32      0                  32
          Destination Address  32      0                  32
          Total                -       32                 96

UDP       Source Port          16      0                  16
          Destination Port     16      0                  16
          Length               16      0                  0
          Checksum             16      1                  1
          Total                -       1                  33

TCP       Source Port          16      0                  16
          Destination Port     16      0                  16
          Sequence Number      32      0                  18
          Ack. Number          32      0                  14
          Data Offset          4       4                  4
          Checksum             16      1                  1
          Urgent               8       0                  0
          Total                -       5                  69



5.1 Checksums and Counters 

Many protocols use checksums so that transmission errors are likely to be detectable
by the receiver. A checksum is an element of a ring, usually $F_2^c$ or $Z/2^c$,
for some value of $c$. It is computed by splitting the data into elements of
that ring, then summing them together. Checksums over $F_2^c$ are conventional
when the protocol is implemented in hardware, while checksums over $Z/2^c$ are
commonly implemented in software (and are used for IP, TCP, and UDP with
$c = 16$).






David A. McGrew and Scott R. Fluhrer 



A checksum over $F_2^c$ (or CRC) provides $c$ linear equations that always hold.
The Bluetooth specification for wireless networking is one example of a protocol
that includes such a checksum on data that is encrypted by an additive cipher
[3]. A checksum over $Z/2^c$ provides one linear equation that always holds, since
the least significant bit of a sum of integers is equal to the exclusive or of the
least significant bits of the integers. Probabilistic linear equations in other bits 
of the checksum can be derived, but will be poor approximations if the number 
of integers summed together is large. 
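The statement about the least significant bit is easy to check. The sketch below follows the model used in the text, a plain sum modulo $2^c$; the checksum actually used in deployed IP stacks is a ones' complement sum, which this example does not attempt to reproduce. The function names are illustrative.

```python
import random

def checksum_mod(words, c=16):
    """Checksum over Z/2^c: the sum of the c-bit words modulo 2^c."""
    return sum(words) % (1 << c)

def lsb_xor(words):
    """Exclusive or of the least significant bits of the words."""
    x = 0
    for w in words:
        x ^= w & 1
    return x

rng = random.Random(0)
trials = [[rng.randrange(1 << 16) for _ in range(20)] for _ in range(100)]
always_holds = all((checksum_mod(ws) & 1) == lsb_xor(ws) for ws in trials)
```

The low bit of the checksum is therefore a linear function over $F_2$ of the low bits of the summed words, which gives one equation that holds with probability one.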

In many protocols, an integer called a counter is included in each packet, and 
is used to indicate the ordering of the packets to the receiver. Counters may be 
incremented by one for each new packet, or may be incremented by some other 
value (e.g., the number of bytes contained in the data portion of the packet, as 
is done in the TCP protocol). If a $c$-bit counter $x$ appears in a packet, and $x + y$
appears in another packet, where $y < 2^q$ for some $q$, then

$x_{i+q} = (x + y)_{i+q}$ with probability $\ge 1 - 2^{-i}$. (19)

A $c$-bit counter that increments by a value less than $2^q$ provides a significant
amount of information.
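Equation (19) can be verified exhaustively for a small counter. The sketch below assumes that bit 0 is the least significant bit, that $x$ and $y$ are uniformly distributed, and that the counter wraps modulo $2^c$; the small values $c = 12$ and $q = 3$ are chosen only so that the loop finishes quickly.

```python
def agreement_fraction(c: int, q: int, i: int) -> float:
    """Fraction of pairs (x, y), with x a c-bit counter and 0 <= y < 2^q, for
    which bit q+i of x equals bit q+i of (x + y) mod 2^c."""
    pos = q + i
    modulus = 1 << c
    agree = total = 0
    for x in range(modulus):
        for y in range(1 << q):
            total += 1
            if ((x >> pos) & 1) == ((((x + y) % modulus) >> pos) & 1):
                agree += 1
    return agree / total

c, q = 12, 3
fractions = {i: agreement_fraction(c, q, i) for i in range(c - q)}

# For a 32-bit TCP counter incremented by less than 2^11, the bits with
# i >= 3 agree with probability at least 7/8.
tcp_equations = 32 - 11 - 3
```

A bit above position $q$ changes only when a carry out of the low $q$ bits propagates through every intermediate bit, which is why the bound improves by a factor of two per bit position.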

IP, TCP, and UDP all use checksums over $Z/2^{16}$. The low bit of the checksum
is a linear function of the other packet data, from Section 5.1. If the layer three 
protocol of a packet is known, then the checksums provide two linear equations 
that hold with probability one. 

TCP packets contain a 32-bit counter that is incremented by the length (in 
bytes) of the packet’s data. These lengths will be no more than 1500 (which 
is the Ethernet MTU) with high probability. Since $2^{11} > 1500$, two adjacent
counters provide 18 linear equations that hold with probability 7/8 or greater. 
To use these equations in an attack requires some traffic analysis to discover 
two sequential TCP packets. The TCP ‘Acknowledgement Number’ similarly 
provides about 14 linear equations. 

6 Conclusions 

Practical attacks on additive stream ciphers that rely on linear equations over 
the plaintext bits are possible, even when those equations hold probabilistically. 
The IP, TCP, and UDP protocol headers have a significant amount of linear 
redundancy, and are vulnerable to these attacks. In practice, effective key sizes 
of Internet encryption are close to $n - \lg M$ when a cryptanalyst has $M$ ciphertexts
encrypted under distinct keys available. We conjecture that only protocols
specifically designed to not be linearly redundant will not be vulnerable to these 
attacks. Compression would reduce the linear redundancy of a source; however, 
we are pessimistic about the effectiveness of using compression to protect against 
our attacks in practice. 

While our attacks are powerful, there is an easy defense against them: increase 
the key size of the cipher. Cipher keys can be extended in ways that are not 
secure against other forms of attack (e.g., ‘whitening’ with a fixed value) and 
still provide resistance to our attacks. This approach is similar to the idea of
concatenating ‘salt’ (e.g., unique but public data) to a secret password in order 
to reduce the effectiveness of attacks that amortize effort across many passwords, 
variable size key, although in common usage its key size is 128 bits. 

The attacks that we outline are possible against Internet traffic encrypted 
with 128-bit RC4 with a complexity of about $2^{88}$, assuming that an adversary can
intercept ciphertexts from $2^{40}$ distinct sessions. This number is feasible; a single
Internet site that establishes 2®^ SSL connections per day has been reported [17]. 
While this attack is beyond the limit of current cryptanalytic technology, it is 
worth noting that it does no harm to increase the key size to compensate for our 
attacks: the throughput of the RC4 cipher is independent of its key size. 

The attacks that we presented rely on the fact that the secret keys are chosen 
uniformly at random. If the keys are chosen from a highly skewed probability 
distribution (e.g., a broken random number generator that outputs the same 
number every time), the effectiveness of our attacks is significantly reduced. Of 
course, the broken random number generator creates other security problems! 

Considerable future work remains untouched. While we established the vi- 
ability of attacks relying on the redundancy of plaintext encrypted by additive 
stream ciphers, we did not investigate efficient decoding methods for use when 
the linear equations are probabilistic. Also, it may be possible to extend the 
time-memory tradeoff approach so that it can be used in the probabilistic case. 
Abstract. At Crypto'90, Koyama and Terada proposed a family of
cryptographic functions for application to symmetric block ciphers.
Youssef and Tavares showed that this family is affine and hence
completely insecure. In response, Koyama and Terada modified
their design by including a data-dependent operation between layers.
The modified family of circuits was presented at the first International
Security Workshop (ISW'97). In this paper, we show that the modified
circuit can be easily broken by a differential-like attack. More explicitly,
we show that after $d$ rounds, and for any specific key $K$, the input
space can be partitioned into $M \leq 2^d$ sets such that the ciphertext $Y$
of each set is related to the plaintext $X$ by an affine relation. The
expected value of $M$ is much smaller than $2^d$. Our attack enables us to explicitly recover
these linear relations. We were able to break an 8-round 64-bit version
of this family in a few minutes on a workstation using less than $2^{20}$ chosen
plaintext-ciphertext pairs.

Keywords: Block cipher, cryptanalysis, augmented parity circuits 



1 Introduction and Definitions 

Koyama and Terada [2] proposed a family of cryptographic functions called 
“non-linear” parity circuits. Youssef and Tavares [7] showed that this family 
of functions is affine over GF(2) and hence it is completely insecure. In [3],
Koyama and Terada introduced a random involution called Value-Dependent- 
Swapping (VDS). In the VDS, the left half and the right half of a sequence of 
bits are swapped if its parity is odd. In [4], [5] the VDS was incorporated into 
DES in order to make it stronger against differential and linear cryptanalysis. 
By including this VDS in the parity circuits proposed in [2], Koyama and Terada 
obtained what they called an augmented version of their cryptographic functions 
family. The following definitions are given in [3]. 

Definition 1. Let $x = L\|R$ be a sequence of $2k$, $k > 0$, bits where $L$ stands
for the left half of $x$ and $R$ stands for the right, $length(L) = length(R) = k$. A value
dependent swapping, or $V(x)$, is defined to be

$V(x) = \begin{cases} R\|L & \text{if } h(x) = 0, \\ L\|R & \text{if } h(x) = 1, \end{cases}$ (1)

where $h(x) \in \{0, 1\}$.



Definition 2. Let $x = x_l\|x_r$ be a sequence of $2k$, $k > 0$, bits where $x_l$ stands for the
left half of $x$ and $x_r$ stands for the right, $length(x_l) = length(x_r) = k$. A VDS, which
is an involution value-dependent-swapping based on the parity of the weight of
$x$, is defined to be

$V(x) = \begin{cases} x_r\|x_l & \text{if } weight(x) \text{ is odd}, \\ x_l\|x_r & \text{if } weight(x) \text{ is even}, \end{cases}$ (2)

where $weight(x)$ is the number of 1's in the bit sequence $x$.
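A minimal sketch of the VDS of Definition 2, with a bit sequence modeled as a Python list of 0/1 integers (a representation chosen here for illustration):

```python
def weight(bits):
    """Number of 1's in the bit sequence."""
    return sum(bits)

def vds(bits):
    """Value-dependent swapping: swap the two halves when the weight is odd."""
    assert len(bits) > 0 and len(bits) % 2 == 0
    k = len(bits) // 2
    left, right = bits[:k], bits[k:]
    return right + left if weight(bits) % 2 == 1 else left + right

x = [1, 0, 1, 1, 0, 0]        # weight 3 (odd), so the halves are swapped
print(vds(x))
print(vds(vds(x)) == x)       # swapping preserves the weight, so V is an involution
```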



Definition 3. A parity layer with length n, or simply an L{n) circuit layer, is a 
Boolean device with an n-bit input and n-bit output, characterized by a key that 
is a sequence of $n$ symbols from $\{0, 1, +, -\}$.



Definition 4. A function $B = f(K, A)$ computed by an $L(n)$ circuit layer with
key $K = k_1 k_2 \cdots k_n \in \{0, 1, +, -\}^n$ is the relation from an n-bit input sequence
$A = a_1 a_2 \cdots a_n \in \{0, 1\}^n$ to an n-bit sequence $B = b_1 b_2 \cdots b_n \in \{0, 1\}^n$ defined
below. An $L(n)$ circuit layer computes first the variable $T$ modulo 2 such that

$T = \sum_{j=1}^{n} t_j$, (3)

where

$t_j = \begin{cases} 1 & \text{if } (k_j = 0 \text{ and } a_j = 0) \text{ or } (k_j = 1 \text{ and } a_j = 1), \\ 0 & \text{otherwise.} \end{cases}$ (4)

The output $B = b_1 b_2 \cdots b_n$ of the circuit layer is then

$b_j = \begin{cases} \bar{a}_j & \text{if } (k_j = - \text{ and } T = 1) \text{ or } (k_j = + \text{ and } T = 0), \\ a_j & \text{otherwise.} \end{cases}$ (5)
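The layer function can be sketched as follows. This is an illustration under a stated assumption: equation (5) is read as complementing $a_j$ when ($k_j = -$ and $T = 1$) or ($k_j = +$ and $T = 0$), and leaving it unchanged otherwise. The key is modeled as a string over the characters `0`, `1`, `+`, `-`, and the name `parity_layer` is hypothetical.

```python
def parity_layer(key, a):
    """One L(n) layer: key is a string over '0', '1', '+', '-'; a is a list of bits."""
    assert len(key) == len(a)
    # T (mod 2): parity of the number of '0'/'1' key positions that match the
    # input bit, as in equations (3) and (4).
    t = sum(1 for k, bit in zip(key, a) if k in "01" and int(k) == bit) % 2
    out = []
    for k, bit in zip(key, a):
        # Assumed reading of equation (5).
        flip = (k == "-" and t == 1) or (k == "+" and t == 0)
        out.append(bit ^ 1 if flip else bit)
    return out

print(parity_layer("0+-1", [0, 1, 1, 0]))
```

Under this reading $T$ is an affine function of the input and every output bit is the input bit plus a constant multiple of $T$ or $T \oplus 1$, so the whole layer is affine over GF(2) for a fixed key.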



Definition 5. A parity circuit of width $n$ and depth $d$, or simply $C(n, d)$ circuit,
is a matrix of $d$ $L(n)$ circuit layers with keys denoted by $K = K_1\|K_2 \cdots K_d$ for
which the $n$ output bits of the $(i-1)$-th circuit layer are the $n$ input bits for the
$i$-th circuit layer, for $2 \leq i \leq d$. The key for the $C(n, d)$ circuit is a $d \times n$ matrix
with its $d$ lines containing circuit layer keys.
Table 1. $C_+(n, d)$ with $n = 10$ and $d = 3$

Input    1 0 1 1 0 0 1 0 0 1    Swap
K_1      - 0 1 - + + 1 1 - +
Output   0 0 1 1 1 0 0 0 0 0    yes
K_2      + 1 0 1 1 + 0 - + -
Output   0 1 1 0 0 0 0 1 0 1    no
K_3      - 0 1 + + 0 - + + -
Output   0 0 0 1 1 0 1 0 1 1    yes



Let $F$ be the function from $\{0, 1\}^n$ to $\{0, 1\}^n$ computed by a circuit $C(n, d)$
with key $K = K_1\|K_2 \cdots K_d$. That is, $F(K, A)$ is defined as

$F(K, A) = f(K_d, f(K_{d-1}, \cdots, f(K_1, A) \cdots))$. (6)

By showing that, for any fixed key, the $C(n, d)$ circuit can be constructed using
XOR gates only, Youssef and Tavares [7] showed that $F(K, A)$ above is affine
over GF(2).

Definition 6. A function $B = f_+(K, A)$ computed by an augmented $L(n)$ circuit
layer with key $K$, or simply $L_+(n)$ layer, is the function $V(f(K, A))$, where
$V$ is the VDS function as in Definition 2, and $f$ is the function computed by an
$L(n)$ circuit layer.



Definition 7. An augmented parity circuit of width $n$ and depth $d$, or simply
$C_+(n, d)$ circuit, is a matrix of $d$ $L_+(n)$ circuit layers with keys denoted by
$K = K_1\|K_2 \cdots K_d$ for which the $n$ output bits of the $(i-1)$-th circuit layer
are the $n$ input bits for the $i$-th circuit layer, for $2 \leq i \leq d$. The key for the
$C_+(n, d)$ circuit is a $d \times n$ matrix with its $d$ lines containing circuit layer keys.
The function $F_+$ from $\{0, 1\}^n$ to $\{0, 1\}^n$ computed by a circuit $C_+(n, d)$ with key
$K = K_1\|K_2 \cdots K_d$ is defined as

$F_+(K, A) = f_+(K_d, f_+(K_{d-1}, \cdots, f_+(K_1, A) \cdots))$. (7)

Table 1 shows the example given in [3] for a $C_+(n, d)$ circuit with $n = 10$ and
$d = 3$.

2 Cryptanalysis of the $C_+(n, d)$ Circuit

Since the $C(n, d)$ circuit is affine [7], the $C_+(n, d)$ circuit can be viewed as a
composition of key-dependent affine transformations and the VDS layer (see
Figure 1). Thus the security of the $C_+(n, d)$ relies heavily on the cryptographic
strength of the VDS layer.
Fig. 1. The $C_+(n, d)$ viewed as a composition of affine and VDS layers

Observation 01 For any specific key $k$, the ciphertext $Y$ of the $C_+(n, d)$ circuit
is related to the plaintext $X$ by one of the affine relations

$Y = A_i(k) X \oplus b_i(k)$, $i = 1, \ldots, M$, (8)

where $A_i(k)$ is a key-dependent nonsingular binary matrix,
$b_i(k)$ is a key-dependent $n \times 1$ binary vector and $M \leq 2^d$.

Proof. Let $VDS_i$ denote the swap variable at round $i$, i.e., $VDS_i = 0$ if the parity
of the input to the VDS layer at round $i$ is even and $VDS_i = 1$ if this parity is
odd. Thus $VDS_i \in \{0, 1\}$ and hence for a $C_+(n, d)$ circuit, $(VDS_1, \cdots, VDS_d) \in
\{0, 1\}^d$. Thus the input space of the $C_+(n, d)$ circuit can be partitioned into $2^d$
sets

$S_1, S_2, \cdots, S_{2^d}$, (9)

where for any fixed $1 \leq i \leq 2^d$, $VDS_1, \cdots, VDS_d$ is fixed and hence the $d$ VDS
layers can be modeled by fixed bit permutation layers. The output $Y$ corresponding
to the input $X \in S_i$ can be obtained by a composition of fixed affine
relations and hence $Y$ is related to $X$ by a fixed affine relation for all $X \in S_i$.
Since there is no guarantee that all the $2^d$ possible values of $VDS_1, \cdots, VDS_d$
will appear, $M \leq 2^d$. □
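The partition in the proof can be explored on a toy circuit by enumerating all inputs and collecting the distinct swap patterns $(VDS_1, \cdots, VDS_d)$. The sketch below is an illustration only: it assumes the reading of equation (5) in which $a_j$ is complemented when ($k_j = -$ and $T = 1$) or ($k_j = +$ and $T = 0$), and the helper names and the random key are hypothetical.

```python
import random
from itertools import product

def parity_layer(key, a):
    """One L(n) layer under the assumed reading of equation (5)."""
    t = sum(1 for k, bit in zip(key, a) if k in "01" and int(k) == bit) % 2
    return [bit ^ 1 if (k == "-" and t == 1) or (k == "+" and t == 0) else bit
            for k, bit in zip(key, a)]

def swap_pattern(keys, x):
    """Return the tuple (VDS_1, ..., VDS_d) produced by input x."""
    state, half, pattern = list(x), len(x) // 2, []
    for key in keys:
        state = parity_layer(key, state)
        swap = sum(state) % 2              # parity of the input to this VDS layer
        pattern.append(swap)
        if swap:                           # odd weight: swap the two halves
            state = state[half:] + state[:half]
    return tuple(pattern)

def count_sets(keys, n):
    """Number M of distinct swap patterns over all 2^n inputs."""
    return len({swap_pattern(keys, x) for x in product((0, 1), repeat=n)})

rng = random.Random(7)
n, d = 8, 5
keys = ["".join(rng.choice("01+-") for _ in range(n)) for _ in range(d)]
print(count_sets(keys, n), "sets out of at most", min(2 ** d, 2 ** n))
```

Each distinct pattern corresponds to one set $S_i$, so the count is the value of $M$ for that key.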

Figure 2 illustrates the $C_+(n, d)$ equivalent circuit according to observation
01 above. The following observation illustrates how the "swap control" function
in this figure operates. By noting that $VDS_i$ is a linear function of the input to
layer $i$, we have

Observation 02 Inputs that belong to the same set in observation 01 above
must satisfy a set of $d$ linear equations.
Fig. 2. Equivalent circuit of the $C_+(n, d)$ according to observation 01



For a given known key, these $2^d$ sets of $d$ linear relations can be derived by
calculating the parity of the input to the $d$ VDS layers in terms of the input $X$.
If some of these linear relations don't have a solution, then $M$ will be less than
$2^d$. Figure 4 shows the linear relations corresponding to Example 1 in [3]. Note
that for this particular example, we have more than one possible solution for
the $A_i$'s and $b_i$'s. Figure 4 shows only one of these possible solutions. While
observations 01 and 02 are enough to cause unease about using the $C_+(n, d)$
for most practical values of $d$, we extend our attack to find these linear relations.
The main idea is to develop an algorithm that can be used to group the
input/output pairs that belong to the same set $S_i$ and then solve a set of linear
equations to find the matrix $A_i$ and the vector $b_i$. The attack makes use of the
following observation.

Observation 03 For the $C_+(n, d)$, if the inputs $R_1$, $R_2$ and $R_3$ belong to the set
$S_i$, then

$R_4 = R_3 \oplus (R_1 \oplus R_2)$

belongs to the same set $S_i$.

Proof. If $R_1$, $R_2$ and $R_3 \in S_i$ then they must satisfy a set of $d$ linear equations
in the form

$C R_1 = b$, $C R_2 = b$, $C R_3 = b$,

where $C$ is a $d \times n$ matrix and $b$ is a $d \times 1$ vector. The observation is proved
by noting that

$C R_4 = C R_3 \oplus C R_1 \oplus C R_2 = b$
    R1 = Random()
    do {
        pass = 0
        R2 = Random()
        δx = R1 ⊕ R2
        for i = 1 to Trials {
            R3 = Random()
            R4 = R3 ⊕ δx
            δy = F+(R1) ⊕ F+(R2) ⊕ F+(R3) ⊕ F+(R4)
            if (δy = 0) increment pass
        }
        if (pass > Threshold) declare R1 and R2 ∈ same set
    } while number of collected pairs < P

Fig. 3. Basic steps in the attack



and hence $R_4$ also satisfies this set of equations. Thus $R_4$ must belong to the same
set $S_i$. □

Note that if $R_1$, $R_2$, and $R_3 \in S_i$ then for any key $K$

$F_+(K, R_1) \oplus F_+(K, R_2) \oplus F_+(K, R_3) \oplus F_+(K, R_3 \oplus (R_1 \oplus R_2)) = 0$. (10)

In our attack, we pick random triples $R_1$, $R_2$ and $R_3$ and test for the condition
in equation (10). Since there is no guarantee that $R_3$ will belong to $S_i$ even if $R_1$
and $R_2$ do, we repeat the test for different values of $R_3$ (Trials in Figure 3). We
decide that $R_1$ and $R_2$ are in the same set if the condition is satisfied a large
number of times (Threshold in Figure 3). Wrong decisions by the algorithm
(i.e., if the algorithm declares that $R_2$ and $R_1$ are in the same set while they
are not) can be filtered out by collecting more than $n + 1$ pairs (e.g., $P = 2n$
pairs) because with high probability the resulting set of equations we will try to
solve will be inconsistent if the algorithm accepts wrong pairs. Another method
to prevent the algorithm from accepting wrong pairs is to increase the value of
Trials and make the value of Threshold very close to Trials. However, this
may increase the number of plaintext-ciphertext pairs required to break the
algorithm. Throughout these experiments, the value of Threshold was set based
on the statistics of the pass variable (see Figure 3). We set Threshold close to
the maximum value of pass.
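The inner loop of Figure 3 can be sketched against a toy $C_+(n, d)$ oracle. Everything here is an assumption made for illustration: the toy circuit uses the reading of equation (5) in which $a_j$ is complemented when ($k_j = -$ and $T = 1$) or ($k_j = +$ and $T = 0$), the key is random, and the swap pattern returned by `f_plus` is exposed only so that the result can be compared with the true set membership (a real attacker sees ciphertexts only).

```python
import random

def parity_layer(key, a):
    """One L(n) layer under the assumed reading of equation (5)."""
    t = sum(1 for k, bit in zip(key, a) if k in "01" and int(k) == bit) % 2
    return [bit ^ 1 if (k == "-" and t == 1) or (k == "+" and t == 0) else bit
            for k, bit in zip(key, a)]

def f_plus(keys, x):
    """Toy C+(n, d): returns (ciphertext, swap pattern)."""
    state, half, pattern = list(x), len(x) // 2, []
    for key in keys:
        state = parity_layer(key, state)
        swap = sum(state) % 2
        pattern.append(swap)
        if swap:
            state = state[half:] + state[:half]
    return tuple(state), tuple(pattern)

def xor(*vectors):
    """Bitwise XOR of equal-length bit tuples."""
    return tuple(sum(column) % 2 for column in zip(*vectors))

def pass_count(keys, r1, r2, trials, rng):
    """Count how often condition (10) holds for random choices of R3."""
    dx = xor(r1, r2)
    passes = 0
    for _ in range(trials):
        r3 = tuple(rng.randint(0, 1) for _ in r1)
        r4 = xor(r3, dx)
        dy = xor(*(f_plus(keys, r)[0] for r in (r1, r2, r3, r4)))
        passes += not any(dy)
    return passes

rng = random.Random(2000)
n, d, trials = 16, 4, 64
keys = ["".join(rng.choice("01+-") for _ in range(n)) for _ in range(d)]
r1 = tuple(rng.randint(0, 1) for _ in range(n))
for _ in range(8):
    r2 = tuple(rng.randint(0, 1) for _ in range(n))
    same_set = f_plus(keys, r1)[1] == f_plus(keys, r2)[1]
    print(same_set, pass_count(keys, r1, r2, trials, rng))
```

Comparing the printed pass counts for pairs that do and do not share a swap pattern gives a feel for how Threshold might be chosen on a given instance.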

3 Analysis of the Algorithm and Experimental Results

Assuming that the input sets are of equal size, the probability that
$R_1$, $R_2$ and $R_3$ are in the same set is $1/M^2$, where $M$ is the number of partitions.
Table 2. Average number of sets versus optimal value for n = 10

d                   1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16
Average(M)          2     3     4     7    11    15    25    37    57    62   100   143   162   232   325   393
min(2^d, 2^n)       2     4     8    16    32    64   128   256   512  1024  1024  1024  1024  1024  1024  1024




The maximum value for $M$ is $\min(2^d, 2^n)$. Thus the number of chosen plaintext-ciphertext
pairs required for the attack increases with $M^2$. In other words, the
success of the attack depends heavily on the number of input partitions. The
intensive use of bit-oriented operations in the $C_+(n, d)$ circuits puts an upper
bound on $d$, and consequently $M$, for any efficient software implementation.
The average number of partitions for $n = 10$ is shown in Table 2. Each point
represents an average over 100 $C_+(n, d)$ circuits with randomly selected keys. It
is clear that this number is much less than the optimum value $\min(2^d, 2^n)$. Our
experimental results show that this large deviation from the optimum case holds
for larger block lengths. It is also easy to prove that if the key $K$ is restricted
to the set $\{0, 1\}$ instead of $\{0, 1, +, -\}$, then $M \leq 2$ for all $d \geq 1$. Note that
because we don't know $M$ in advance, it is hard to optimize the choice of Trials
and Threshold to minimize the number of plaintext-ciphertext pairs required
for the attack. Moreover, our experiments show that the $C_+(n, d)$ circuit fails
to behave like a random function for practical values of $d$ and hence it is not
easy to predict the probability of wrong pairs satisfying equation (10) based on
the random function model. The good point (from the attacker's point of view) is
that the attack works almost all the time. In many cases, we were able to break
an 8-round 64-bit version of this family in a few minutes on a workstation using
less than $2^{20}$ chosen plaintext-ciphertext pairs.

Remark 1. The non-affineness defined in [3] doesn’t provide a useful measure of 
resistance against linear attacks. The nonlinearity of a function / is defined as 
the minimum distance between the set of affine functions and all the non-zero 
linear combinations of the output coordinates of $f$ [6]. Our experiments show
that for practical values of $d$, the average nonlinearity of the $C_+(n, d)$ circuits is
very poor compared to the expected nonlinearity of randomly selected functions
of the same size $n$. Thus it is conceivable that the $C_+(n, d)$ circuit can be broken
using a variant of linear cryptanalysis [6].



4 Conclusion 

The security of the C+(n, d) circuit relies only on the cryptographic strength 
of the VDS function because the rest of the circuit is affine. Controlling the 
swapping based on the parity results in a cryptographically weak function. Thus 
for practical values of $n$ and $d$, the augmented family of parity circuits $C_+(n, d)$
proposed by Koyama and Terada is insecure. 
Fig. 4. Linear relations for Example 1 in [3]
Abstract. We present a new 128-bit block cipher called Camellia.
Camellia supports 128-bit block size and 128-, 192-, and 256-bit keys, 
i.e., the same interface specifications as the Advanced Encryption Stan- 
dard (AES). Efficiency on both software and hardware platforms is a 
remarkable characteristic of Camellia in addition to its high level of se- 
curity. It is confirmed that Camellia provides strong security against 
differential and linear cryptanalyses. Compared to the AES finalists, i.e., 
MARS, RC6, Rijndael, Serpent, and Twofish, Camellia offers at least 
comparable encryption speed in software and hardware. An optimized 
implementation of Camellia in assembly language can encrypt on a Pen- 
tium III (800MHz) at the rate of more than 276 Mbits per second, which 
is much faster than the speed of an optimized DES implementation. In 
addition, a distinguishing feature is its small hardware design. The hard- 
ware design, which includes encryption and decryption and key schedule, 
occupies approximately 11K gates, which is the smallest among all existing
128-bit block ciphers as far as we know.



1 Introduction 

This paper presents a 128-bit block cipher called Camellia, which was jointly 
developed by NTT and Mitsubishi Electric Corporation. Camellia supports 128- 
bit block size and 128-, 192-, and 256-bit key lengths, and so offers the same 
interface specifications as the Advanced Encryption Standard (AES). The design 
goals of Camellia are as follows. 

High Level of Security. The recent advances in cryptanalytic techniques are re- 
markable. A quantitative evaluation of security against powerful cryptanalytic 
techniques such as differential cryptanalysis [4] and linear cryptanalysis [18] is 
considered to be essential in designing any new block cipher. We evaluated the 
security of Camellia by utilizing state-of-the-art cryptanalytic techniques. We have
confirmed that Camellia has no differential and linear characteristics that hold
with probability more than $2^{-128}$. Moreover, Camellia was designed to offer security
against other advanced cryptanalytic attacks including higher order differential
attacks [13,10], interpolation attacks [10,2], related-key attacks [5,15], truncated
differential attacks [13,23], boomerang attacks [26], and slide attacks [6,7].

Efficiency on Multiple Platforms. As cryptographic systems are needed in var- 
ious applications, encryption algorithms that can be implemented efficiently on 
a wide range of platforms are desirable; however, few 128-bit block ciphers are
suitable for both software and hardware implementation. Camellia was designed 
to offer excellent efficiency in hardware and software implementations, including 
gate count for hardware design, memory requirements in smart card implemen- 
tations, as well as performance on multiple platforms. 

Camellia consists of only 8-by-8-bit substitution tables (s-boxes) and logical 
operations that can be efficiently implemented on a wide variety of platforms. 
Therefore, it can be implemented efficiently in software, including the 8-bit pro- 
cessors used in low-end smart cards, 32-bit processors widely used in PCs, and 
64-bit processors. Camellia doesn’t use 32-bit integer additions and multiplica- 
tions, which are extensively used in some software-oriented 128-bit block ciphers. 
Such operations perform well on platforms providing a high degree of support, 
e.g., Pentium II/III or Athlon, but not as well on others. These operations can 
cause a longer critical path and larger hardware implementation requirements. 

The s-boxes of Camellia are designed to minimize hardware size. The four
s-boxes are affine equivalent to the inversion function in the finite field $GF(2^8)$.
Moreover, we reduced the inversion function in $GF(2^8)$ to a few $GF(2^4)$ arithmetic
operations. This enabled us to implement the s-boxes with fewer gates.

The key schedule is simple and shares part of its procedure with encryption. 
It supports on-the-fly subkey generation and subkeys are computable in any
order. The memory requirement for generating subkeys is quite small; an efficient 
implementation requires about 32-byte RAM for 128-bit keys and about 64-byte 
RAM for 192- and 256-bit keys. 

Outline of the Paper. This paper is organized as follows: Sect. 2 describes the 
notations and high-level structure of Camellia. Section 3 defines each component
of the cipher. Section 4 describes the rationale behind Camellia’s design. In 
Sect. 5 we evaluate Camellia’s strength against known attacks. Section 6 contains 
the performance of Camellia. We conclude in Sect. 7. 

2 Structure of Camellia 

Camellia uses an 18-round Feistel structure for 128-bit keys, and a 24-round Feistel
structure for 192- and 256-bit keys, with additional input/output whitenings
and logical functions called the FL-function and $FL^{-1}$-function inserted every
6 rounds. Figure 1 shows an overview of encryption using 128-bit keys. An
element with the suffix $(n)$ is $n$ bits long.

The key schedule generates 64-bit subkeys $kw_t$ ($t = 1, 2, 3, 4$) for input/output
whitenings, $k_u$ ($u = 1, 2, \ldots, r$) for round functions and $kl_v$ ($v = 1, 2, \ldots, r/3 - 2$)
for FL- and $FL^{-1}$-functions from the secret key $K$, where $r$ is the number of
rounds.

Fig. 1. Encryption procedure of Camellia for 128-bit keys



2.1 Notations 

$X_L$, $X_R$ : the left-half and the right-half data of $X$, respectively.

$\oplus$, $\cap$, $\cup$ : bitwise exclusive-OR (XOR), AND and OR operation, respectively.
$\|$ : concatenation of two operands.

$\ggg_n$, $\lll_n$ : rotation to the right and the left by $n$ bits, respectively.

0x : hexadecimal representation.



2.2 Encryption for 128-Bit Keys 

First a 128-bit plaintext $M$ is XORed with $kw_1\|kw_2$ and separated into two
64-bit data $L_0$ and $R_0$, i.e., $M \oplus (kw_1\|kw_2) = L_0\|R_0$. Then, the following
operations are performed from $r = 1$ to 18, except for $r = 6$ and 12:

$L_r = R_{r-1} \oplus F(L_{r-1}, k_r)$, $R_r = L_{r-1}$.

For $r = 6$ and 12, the following is carried out:

$L'_r = R_{r-1} \oplus F(L_{r-1}, k_r)$, $R'_r = L_{r-1}$,

$L_r = FL(L'_r, kl_{r/3-1})$, $R_r = FL^{-1}(R'_r, kl_{r/3})$.

Lastly, $R_{18}$ and $L_{18}$ are concatenated and XORed with $kw_3\|kw_4$. The resultant
value is the 128-bit ciphertext, i.e., $C = (R_{18}\|L_{18}) \oplus (kw_3\|kw_4)$.

2.3 Encryption for 192- and 256-Bit Keys 

Similarly to the encryption for 128-bit keys, first a 128-bit plaintext $M$ is XORed
with $kw_1\|kw_2$ and separated into two 64-bit data $L_0$ and $R_0$, i.e., $M \oplus (kw_1\|kw_2)
= L_0\|R_0$. Then, the following operations are performed from $r = 1$ to 24, except
for $r = 6$, 12, and 18:

$L_r = R_{r-1} \oplus F(L_{r-1}, k_r)$, $R_r = L_{r-1}$.

For $r = 6$, 12, and 18, the following are performed:

$L'_r = R_{r-1} \oplus F(L_{r-1}, k_r)$, $R'_r = L_{r-1}$,

$L_r = FL(L'_r, kl_{r/3-1})$, $R_r = FL^{-1}(R'_r, kl_{r/3})$.

Lastly, $R_{24}$ and $L_{24}$ are concatenated and XORed with $kw_3\|kw_4$. The resultant
value is the 128-bit ciphertext, i.e., $C = (R_{24}\|L_{24}) \oplus (kw_3\|kw_4)$.

2.4 Decryption 

The decryption procedure of Camellia can be done in the same way as the
encryption procedure by reversing the order of the subkeys, which is one of
the merits of Feistel networks. In Camellia, $FL/FL^{-1}$-function layers are inserted
every 6 rounds, but this property is still preserved.
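The subkey-reversal property can be illustrated with a structural sketch of the 18-round data path. It is not an implementation of Camellia: `toy_f` is a placeholder round function invented for this example, all subkeys are arbitrary constants rather than outputs of the key schedule, and the FL layer is written as $Y_R = ((X_L \cap kl_L) \lll_1) \oplus X_R$, $Y_L = (Y_R \cup kl_R) \oplus X_L$, which is an assumption about the reading of Sect. 3.4.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def rotl32(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK32

def fl(x, k):
    xl, xr, kl, kr = x >> 32, x & MASK32, k >> 32, k & MASK32
    yr = rotl32(xl & kl, 1) ^ xr
    yl = (yr | kr) ^ xl
    return (yl << 32) | yr

def fl_inv(y, k):
    yl, yr, kl, kr = y >> 32, y & MASK32, k >> 32, k & MASK32
    xl = (yr | kr) ^ yl
    xr = rotl32(xl & kl, 1) ^ yr
    return (xl << 32) | xr

def toy_f(x, k):
    """Placeholder round function (NOT Camellia's F); any function works in a Feistel net."""
    return ((x ^ k) * 0x9E3779B97F4A7C15 + 1) & MASK64

def crypt(block, kw, k, kl):
    """Whitening, len(k) Feistel rounds, FL/FL^-1 after every 6th round except the last."""
    left, right = (block >> 64) ^ kw[0], (block & MASK64) ^ kw[1]
    for r in range(1, len(k) + 1):
        left, right = right ^ toy_f(left, k[r - 1]), left
        if r % 6 == 0 and r != len(k):
            left = fl(left, kl[r // 3 - 2])          # kl_{r/3-1}, 1-based in the text
            right = fl_inv(right, kl[r // 3 - 1])    # kl_{r/3}
    return ((right ^ kw[2]) << 64) | (left ^ kw[3])

# Arbitrary illustrative subkeys.
kw = [0x0123456789ABCDEF, 0xFEDCBA9876543210, 0x0F1E2D3C4B5A6978, 0x8796A5B4C3D2E1F0]
k = [(0x1111111111111111 * (i + 1)) & MASK64 for i in range(18)]
kl = [0x00FF00FF00FF00FF, 0x0F0F0F0F0F0F0F0F, 0x3333333333333333, 0x5555555555555555]

plaintext = 0x00112233445566778899AABBCCDDEEFF
ciphertext = crypt(plaintext, kw, k, kl)
# Decryption: same procedure, whitening keys exchanged, round and FL subkeys reversed.
recovered = crypt(ciphertext, kw[2:] + kw[:2], k[::-1], kl[::-1])
print(recovered == plaintext)
```

Reversing the `kl` list sends each $FL^{-1}$ output through FL with the same subkey (and vice versa), which is why the inserted layers do not disturb the usual Feistel symmetry.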

2.5 Key Schedule 

Figure 2 shows the key schedule of Camellia. Two 128-bit variables $K_L$ and $K_R$
are defined as follows. For 128-bit keys, the 128-bit key $K$ is used as $K_L$ and
$K_R$ is 0. For 192-bit keys, the left 128 bits of the key $K$ are used as $K_L$, and the
concatenation of the right 64 bits of $K$ and the complement of the right 64 bits of
$K$ is used as $K_R$. For 256-bit keys, the left 128 bits of the key $K$ are used as $K_L$
and the right 128 bits of $K$ are used as $K_R$.
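The three cases can be sketched as follows, with keys modeled as Python integers whose most significant bits are the "left" bits. The function name `split_key` and the sample key are illustrative assumptions.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1

def split_key(key, bits):
    """Return (K_L, K_R) for a 128-, 192-, or 256-bit key given as an integer."""
    if bits == 128:
        return key, 0
    if bits == 192:
        right64 = key & MASK64
        # K_R = right 64 bits of K followed by their bitwise complement.
        return key >> 64, (right64 << 64) | (right64 ^ MASK64)
    if bits == 256:
        return key >> 128, key & MASK128
    raise ValueError("key length must be 128, 192, or 256 bits")

sample = 0x0123456789ABCDEFFEDCBA98765432100011223344556677
k_l, k_r = split_key(sample, 192)
print(f"{k_l:032x} {k_r:032x}")
```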

Two 128-bit variables $K_A$ and $K_B$ are generated from $K_L$ and $K_R$ as shown
in Fig. 2. Note that $K_B$ is used only if the length of the secret key is 192 or 256
bits. The 64-bit constants $\Sigma_i$ ($i = 1, 2, \ldots, 6$) are used as "keys" in the Feistel
network. They are defined as continuous values from the second hexadecimal
place to the seventeenth hexadecimal place of the hexadecimal representation of
the square root of the $i$-th prime. These constant values are shown in Table 1.
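The definition of the constants can be turned into a short computation with exact integer arithmetic. The sketch assumes that "hexadecimal place" counts digits after the hexadecimal point, so each constant consists of fractional digits 2 through 17 of the square root; the function name `sigma` is illustrative.

```python
from math import isqrt

FIRST_PRIMES = (2, 3, 5, 7, 11, 13)

def sigma(i):
    """Fractional hex digits 2..17 of sqrt(i-th prime), as a 64-bit integer."""
    prime = FIRST_PRIMES[i - 1]
    # floor(sqrt(prime) * 16**17): integer part followed by 17 fractional hex digits.
    scaled_root = isqrt(prime << (4 * 34))
    # Keeping the low 16 hex digits drops the integer part and the first place.
    return scaled_root % (1 << 64)

for i in range(1, 7):
    print(f"Sigma_{i} = 0x{sigma(i):016X}")
```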

The 64-bit subkeys $kw_t$, $k_u$, and $kl_v$ are generated from $K_L$, $K_R$, $K_A$, and
$K_B$. The subkeys are generated by rotating $K_L$, $K_R$, $K_A$, and $K_B$ and taking
the left or right half of them. Details are shown in Table 2.
Table 1. The key schedule constants
Table 2. Subkeys for 128-bit keys and 192/256-bit keys

128-bit keys
                 subkey    value
Prewhitening     kw_1      $(K_L \lll_{0})_L$
                 kw_2      $(K_L \lll_{0})_R$
F (Round 1)      k_1       $(K_A \lll_{0})_L$
F (Round 2)      k_2       $(K_A \lll_{0})_R$
F (Round 3)      k_3       $(K_L \lll_{15})_L$
F (Round 4)      k_4       $(K_L \lll_{15})_R$
F (Round 5)      k_5       $(K_A \lll_{15})_L$
F (Round 6)      k_6       $(K_A \lll_{15})_R$
FL               kl_1      $(K_A \lll_{30})_L$
FL^{-1}          kl_2      $(K_A \lll_{30})_R$
F (Round 7)      k_7       $(K_L \lll_{45})_L$
F (Round 8)      k_8       $(K_L \lll_{45})_R$
F (Round 9)      k_9       $(K_A \lll_{45})_L$
F (Round 10)     k_10      $(K_L \lll_{60})_R$
F (Round 11)     k_11      $(K_A \lll_{60})_L$
F (Round 12)     k_12      $(K_A \lll_{60})_R$
FL               kl_3      $(K_L \lll_{77})_L$
FL^{-1}          kl_4      $(K_L \lll_{77})_R$
F (Round 13)     k_13      $(K_L \lll_{94})_L$
F (Round 14)     k_14      $(K_L \lll_{94})_R$
F (Round 15)     k_15      $(K_A \lll_{94})_L$
F (Round 16)     k_16      $(K_A \lll_{94})_R$
F (Round 17)     k_17      $(K_L \lll_{111})_L$
F (Round 18)     k_18      $(K_L \lll_{111})_R$
Postwhitening    kw_3      $(K_A \lll_{111})_L$
                 kw_4      $(K_A \lll_{111})_R$

192/256-bit keys
                 subkey    value
Prewhitening     kw_1      $(K_L \lll_{0})_L$
                 kw_2      $(K_L \lll_{0})_R$
F (Round 1)      k_1       $(K_B \lll_{0})_L$
F (Round 2)      k_2       $(K_B \lll_{0})_R$
F (Round 3)      k_3       $(K_R \lll_{15})_L$
F (Round 4)      k_4       $(K_R \lll_{15})_R$
F (Round 5)      k_5       $(K_A \lll_{15})_L$
F (Round 6)      k_6       $(K_A \lll_{15})_R$
FL               kl_1      $(K_R \lll_{30})_L$
FL^{-1}          kl_2      $(K_R \lll_{30})_R$
F (Round 7)      k_7       $(K_B \lll_{30})_L$
F (Round 8)      k_8       $(K_B \lll_{30})_R$
F (Round 9)      k_9       $(K_L \lll_{45})_L$
F (Round 10)     k_10      $(K_L \lll_{45})_R$
F (Round 11)     k_11      $(K_A \lll_{45})_L$
F (Round 12)     k_12      $(K_A \lll_{45})_R$
FL               kl_3      $(K_L \lll_{60})_L$
FL^{-1}          kl_4      $(K_L \lll_{60})_R$
F (Round 13)     k_13      $(K_R \lll_{60})_L$
F (Round 14)     k_14      $(K_R \lll_{60})_R$
F (Round 15)     k_15      $(K_B \lll_{60})_L$
F (Round 16)     k_16      $(K_B \lll_{60})_R$
F (Round 17)     k_17      $(K_L \lll_{77})_L$
F (Round 18)     k_18      $(K_L \lll_{77})_R$
FL               kl_5      $(K_A \lll_{77})_L$
FL^{-1}          kl_6      $(K_A \lll_{77})_R$
F (Round 19)     k_19      $(K_R \lll_{94})_L$
F (Round 20)     k_20      $(K_R \lll_{94})_R$
F (Round 21)     k_21      $(K_A \lll_{94})_L$
F (Round 22)     k_22      $(K_A \lll_{94})_R$
F (Round 23)     k_23      $(K_L \lll_{111})_L$
F (Round 24)     k_24      $(K_L \lll_{111})_R$
Postwhitening    kw_3      $(K_B \lll_{111})_L$
                 kw_4      $(K_B \lll_{111})_R$






Kazumaro Aoki et al. 



Fig. 2. Key schedule



3 Components of Camellia 

3.1 F-Function

The F-function is shown in Fig. 3. The F-function uses the SPN (Substitution-Permutation
Network) structure. The S-function is the non-linear layer and the
P-function is the linear layer.

3.2 S-Function, s-Boxes

The S-function consists of eight s-boxes, and four different s-boxes, $s_1$, $s_2$, $s_3$,
and $s_4$, are used. All of them are affine equivalent to the inversion function in
$GF(2^8)$. The data of $s_2$, $s_3$, and $s_4$ can be generated from the $s_1$ table. The
tables are shown in [1].

$s_1 : GF(2)^8 \to GF(2)^8$, $x \mapsto h(g(f(\mathtt{0xc5} \oplus x))) \oplus \mathtt{0x6e}$

$s_2 : GF(2)^8 \to GF(2)^8$, $x \mapsto s_1(x) \lll_1$

$s_3 : GF(2)^8 \to GF(2)^8$, $x \mapsto s_1(x) \ggg_1$

$s_4 : GF(2)^8 \to GF(2)^8$, $x \mapsto s_1(x \lll_1)$
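Deriving the other three tables from $s_1$ takes only byte rotations. In the sketch below `toy_s1` is a placeholder permutation, not Camellia's actual $s_1$ table (the real table is given in [1]), and the helper names are illustrative.

```python
def rotl8(x, n):
    """Rotate an 8-bit value left by n bits (1 <= n <= 7)."""
    return ((x << n) | (x >> (8 - n))) & 0xFF

def rotr8(x, n):
    """Rotate an 8-bit value right by n bits (1 <= n <= 7)."""
    return rotl8(x, 8 - n)

def derive_sboxes(s1):
    """Build s2, s3, s4 from the s1 table using the relations in the text."""
    s2 = [rotl8(s1[x], 1) for x in range(256)]   # s2(x) = s1(x) <<< 1
    s3 = [rotr8(s1[x], 1) for x in range(256)]   # s3(x) = s1(x) >>> 1
    s4 = [s1[rotl8(x, 1)] for x in range(256)]   # s4(x) = s1(x <<< 1)
    return s2, s3, s4

# Placeholder permutation of 0..255 standing in for the real s1 table.
toy_s1 = [(167 * x + 13) % 256 for x in range(256)]
s2, s3, s4 = derive_sboxes(toy_s1)
print(s2[:4], s3[:4], s4[:4])
```

Storing only $s_1$ and rotating on the fly trades a few instructions per lookup for a quarter of the table memory, which matters on small processors.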



Camellia: A 128-Bit Block Cipher Suitable for Multiple Platforms



Fig. 3. F-function



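where functions $f$ and $h$ are affine functions and function $g$ is the inversion
function in $GF(2^8)$ as given below.

$f : GF(2)^8 \to GF(2)^8$, $(a_1, a_2, \ldots, a_8) \mapsto (b_1, b_2, \ldots, b_8)$,

where

$b_1 = a_6 \oplus a_2$, $b_2 = a_7 \oplus a_1$, $b_3 = a_8 \oplus a_5 \oplus a_3$, $b_4 = a_8 \oplus a_3$,

$b_5 = a_7 \oplus a_4$, $b_6 = a_5 \oplus a_2$, $b_7 = a_8 \oplus a_1$, $b_8 = a_6 \oplus a_4$.

$h : GF(2)^8 \to GF(2)^8$, $(a_1, a_2, \ldots, a_8) \mapsto (b_1, b_2, \ldots, b_8)$,

where

$b_1 = a_5 \oplus a_6 \oplus a_2$, $b_2 = a_6 \oplus a_2$, $b_3 = a_7 \oplus a_4$, $b_4 = a_8 \oplus a_2$,

$b_5 = a_7 \oplus a_3$, $b_6 = a_8 \oplus a_1$, $b_7 = a_5 \oplus a_1$, $b_8 = a_6 \oplus a_3$.

$g : GF(2)^8 \to GF(2)^8$, $(a_1, a_2, \ldots, a_8) \mapsto (b_1, b_2, \ldots, b_8)$,

where

$(b_8 \oplus b_7\alpha \oplus b_6\alpha^2 \oplus b_5\alpha^3) \oplus (b_4 \oplus b_3\alpha \oplus b_2\alpha^2 \oplus b_1\alpha^3)\beta$
$= ((a_8 \oplus a_7\alpha \oplus a_6\alpha^2 \oplus a_5\alpha^3) \oplus (a_4 \oplus a_3\alpha \oplus a_2\alpha^2 \oplus a_1\alpha^3)\beta)^{-1}$.

This inversion is performed in $GF(2^8)$ assuming $0^{-1} = 0$, where $\beta$ is an
element in $GF(2^8)$ that satisfies $\beta^8 \oplus \beta^6 \oplus \beta^5 \oplus \beta^3 \oplus 1 = 0$, and $\alpha = \beta^{238} =
\beta^6 \oplus \beta^5 \oplus \beta^3 \oplus \beta^2$ is an element in $GF(2^4)$ that satisfies $\alpha^4 \oplus \alpha \oplus 1 = 0$.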
Fig. 4. FL-function

Fig. 5. $FL^{-1}$-function



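3.3 P-Function

The P-function is defined as follows:

$P : (GF(2)^8)^8 \to (GF(2)^8)^8$, $(z_1, z_2, \ldots, z_8) \mapsto (z'_1, z'_2, \ldots, z'_8)$,

where

$z'_1 = z_1 \oplus z_3 \oplus z_4 \oplus z_6 \oplus z_7 \oplus z_8$,
$z'_2 = z_1 \oplus z_2 \oplus z_4 \oplus z_5 \oplus z_7 \oplus z_8$,
$z'_3 = z_1 \oplus z_2 \oplus z_3 \oplus z_5 \oplus z_6 \oplus z_8$,
$z'_4 = z_2 \oplus z_3 \oplus z_4 \oplus z_5 \oplus z_6 \oplus z_7$,
$z'_5 = z_1 \oplus z_2 \oplus z_6 \oplus z_7 \oplus z_8$,
$z'_6 = z_2 \oplus z_3 \oplus z_5 \oplus z_7 \oplus z_8$,
$z'_7 = z_3 \oplus z_4 \oplus z_5 \oplus z_6 \oplus z_8$,
$z'_8 = z_1 \oplus z_4 \oplus z_5 \oplus z_6 \oplus z_7$.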
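The P-function translates directly into bytewise XORs. The sketch below takes the eight bytes $z_1, \ldots, z_8$ as a list of integers; the function name `p_function` and the sample input are illustrative.

```python
def p_function(z):
    """Camellia's linear layer: eight input bytes to eight output bytes."""
    z1, z2, z3, z4, z5, z6, z7, z8 = z
    return [
        z1 ^ z3 ^ z4 ^ z6 ^ z7 ^ z8,
        z1 ^ z2 ^ z4 ^ z5 ^ z7 ^ z8,
        z1 ^ z2 ^ z3 ^ z5 ^ z6 ^ z8,
        z2 ^ z3 ^ z4 ^ z5 ^ z6 ^ z7,
        z1 ^ z2 ^ z6 ^ z7 ^ z8,
        z2 ^ z3 ^ z5 ^ z7 ^ z8,
        z3 ^ z4 ^ z5 ^ z6 ^ z8,
        z1 ^ z4 ^ z5 ^ z6 ^ z7,
    ]

# A single nonzero input byte spreads to five output bytes.
print(p_function([0xAB, 0, 0, 0, 0, 0, 0, 0]))
```

Because only XOR is used, the same code works on 8-bit processors as written, while wider processors can pack several bytes per word.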



3.4 FL-Function and $FL^{-1}$-Function

The FL-function is shown in Fig. 4, and is defined as follows.

$FL : GF(2)^{64} \times GF(2)^{64} \to GF(2)^{64}$, $(X_L\|X_R, kl_L\|kl_R) \mapsto Y_L\|Y_R$,

where

$Y_R = ((X_L \cap kl_L) \lll_1) \oplus X_R$, $Y_L = (Y_R \cup kl_R) \oplus X_L$.

The $FL^{-1}$-function is shown in Fig. 5. The following equation holds.

$FL^{-1}(FL(x, k), k) = x$.
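A minimal sketch of the FL-function and its inverse on 64-bit integers follows. It assumes the definition $Y_R = ((X_L \cap kl_L) \lll_1) \oplus X_R$, $Y_L = (Y_R \cup kl_R) \oplus X_L$ with 32-bit halves, and obtains the inverse by undoing the two steps in reverse order; the function names and sample values are illustrative.

```python
MASK32 = (1 << 32) - 1

def rotl32(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK32

def fl(x, k):
    """FL: AND with the left key half, rotate by 1, then OR with the right key half."""
    xl, xr = x >> 32, x & MASK32
    kl, kr = k >> 32, k & MASK32
    yr = rotl32(xl & kl, 1) ^ xr
    yl = (yr | kr) ^ xl
    return (yl << 32) | yr

def fl_inv(y, k):
    """FL^-1: recover X_L first (it only needs Y_R), then X_R."""
    yl, yr = y >> 32, y & MASK32
    kl, kr = k >> 32, k & MASK32
    xl = (yr | kr) ^ yl
    xr = rotl32(xl & kl, 1) ^ yr
    return (xl << 32) | xr

x, k = 0x0123456789ABCDEF, 0xFFFF0000FFFF0F0F
print(fl_inv(fl(x, k), k) == x)
```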



4 Design Rationale 

4.1 F-Function

The design strategy of the F-function of Camellia follows that of the F-function
of E2 [14]. The main difference between E2 and Camellia is the adoption of
the 1-round (conservative) SPN, not the 2-round SPN, i.e., S-P-S. When the
1-round SPN is used as the round function in a Feistel cipher, the theoretical
evaluation of the upper bound of differential and linear characteristic probability
becomes more complicated, but the speed under the same level of "real" security
is expected to be improved. See Sect. 6 for detailed discussions on security.



4.2 P-Function 

The design rationale of the P-function is similar to that of the P-function of 
E2 [16]. That is, for computational efficiency, it should be represented using only 
bytewise XORs and for security against differential and linear cryptanalyses, its 
branch number should be optimal. From among the linear transformations that 
satisfy these conditions, we chose one considering highly efficient implementation
on 32-bit processors [3] and high-end smart cards, as well as 8-bit processors.

4.3 s-Boxes 

As the s-boxes we adopted functions affine equivalent to the inversion function
in $GF(2^8)$ for enhanced security and small hardware design.

There is a function affine equivalent to the inversion function in $GF(2^8)$ that
achieves the best known values of the maximum differential and linear probabilities,
$2^{-6}$. We chose functions of this kind as s-boxes. Moreover, the high degree of
the Boolean polynomial of every output bit of the s-boxes makes it difficult to
attack Camellia by higher order differential attacks. The two affine functions
that are performed at the input and output of the inversion function in $GF(2^8)$
complicate the expressions of the s-boxes in $GF(2^8)$, which is expected to make
interpolation attacks ineffective. Making the four s-boxes different slightly
improves security against truncated differential cryptanalysis [23].

For small hardware design, the elements in $GF(2^8)$ can be represented as
polynomials with coefficients in the subfield $GF(2^4)$. In other words, we can
implement the s-boxes by using a few operations in the subfield $GF(2^4)$ [22].
The two affine functions at the input and output of the inversion function in $GF(2^8)$
also play a role in complicating the expressions of the s-boxes in $GF(2^4)$.
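The value $2^{-6}$ can be reproduced numerically for the plain inversion map. The sketch below is not the Camellia s-box: it uses the reduction polynomial 0x11B purely as an assumption (the text does not state the field representation) and omits the affine layers, which change neither quantity. It computes the largest entry of the difference distribution table and the largest absolute Walsh coefficient.

```python
def gf_mul(a, b, poly=0x11B):
    """Multiply in GF(2^8) modulo an irreducible polynomial (0x11B assumed)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r


def gf_inv(a):
    """Return a^254, which is the inverse of a, with 0 mapped to 0."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


INV = [gf_inv(a) for a in range(256)]


def max_differential_count(sbox):
    """Largest number of x with S(x) ^ S(x ^ dx) == dy over dx != 0."""
    best = 0
    for dx in range(1, 256):
        counts = [0] * 256
        for x in range(256):
            counts[sbox[x] ^ sbox[x ^ dx]] += 1
        best = max(best, max(counts))
    return best


def max_walsh_abs(sbox):
    """Largest |sum_x (-1)^(b.S(x) + a.x)| over output masks b != 0."""
    best = 0
    for b in range(1, 256):
        w = [1 - 2 * (bin(b & sbox[x]).count("1") & 1) for x in range(256)]
        h = 1
        while h < 256:  # fast Walsh-Hadamard transform
            for i in range(0, 256, 2 * h):
                for j in range(i, i + h):
                    u, v = w[j], w[j + h]
                    w[j], w[j + h] = u + v, u - v
            h *= 2
        best = max(best, max(abs(c) for c in w))
    return best


if __name__ == "__main__":
    dp = max_differential_count(INV) / 256         # 4/256 = 2^-6
    lp = (max_walsh_abs(INV) / 256) ** 2           # (32/256)^2 = 2^-6
    print(dp, lp)
```

Both quantities are invariant under affine transformations applied before and after the map, which is why the same bound carries over to functions affine equivalent to inversion.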

4.4 FL- and FL$^{-1}$-Functions

FL- and FL$^{-1}$-functions are “inserted” between every 6 rounds of a Feistel
network to provide non-regularity across rounds. One of the goals for such a
design is to thwart future unknown attacks. It is one of the merits of regular Feistel
networks that encryption and decryption procedures are the same except for the
order of the subkeys. In Camellia, FL/FL$^{-1}$-function layers are inserted every
6 rounds, but this property is still preserved.

The design criteria of these functions are similar to those of the FL-function
of MISTY [20]. The difference between MISTY and Camellia is the addition of
a 1-bit rotation. This is expected to make bytewise cryptanalysis harder, but it
has no negative impact on hardware size or speed. The design criteria are that
these functions must be linear for any fixed key and that their forms depend on
key values. Since these functions are linear as long as the key is fixed, they do
not make the average differential and linear probabilities of the cipher higher.
Moreover, these functions are fast in both software and hardware since they are
constructed by logical operations such as AND, OR, XOR, and rotations.



4.5 Key Schedule 

The design criteria of the key schedule are as follows. 

1. It should be simple and share part of its procedure with encryption (and
decryption).

2. Subkey generation for 128-, 192- and 256-bit keys can be performed by using 
the same key schedule (circuit). Moreover, the key schedule for 128-bit keys 
can be performed by using a part of this circuit. 

3. Key setup time should be shorter than encryption time. In cases where large 
amounts of data are processed with a single secret key, the setup time for key 
scheduling may be unimportant. On the other hand, in applications in which 
the key is changed frequently, key agility is a factor. One basic component 
of key agility is key setup time. 

4. It should support on-the-fly subkey generation. 

5. On-the-fly subkey generation should be computable in the same way in both 
encryption and decryption. Some ciphers have separate key schedules for 
encryption and decryption. In other ciphers, e.g., Rijndael or Serpent, sub- 
keys are computable in the forward direction only and require unwinding for 
decryption. 

6. There should be no equivalent keys. 

7. There should be no related-key attacks or slide attacks. 

Criteria 1 and 2 mainly address small hardware requirements. Criteria 3, 4, 
and 5 are advantageous in terms of practical applications, and Criteria 6 and 7 
are for security. 

The memory requirement for generating subkeys is quite small. An efficient
implementation of Camellia for 128-bit keys requires 16 bytes (=128 bits) for the
original secret key, $K_L$, and 16 bytes (=128 bits) for the intermediate key, $K_A$.
Thus the required memory is 32 bytes. Similarly, an efficient implementation of
Camellia for 192- and 256-bit keys needs only 64 bytes.



5 Security 

This section discusses the security of Camellia. Hereafter, we call Camellia without
FL- and FL$^{-1}$-functions Camellia*.

5.1 Differential and Linear Cryptanalysis

The most well-known and powerful approaches to attacking many block ciphers 
are differential cryptanalysis [4] and linear cryptanalysis [18]. There are several 
methods of evaluating security against these attacks, where there is a kind of 
“duality” relation between them [19,8]: in other words, the security against both 
attacks can be evaluated in similar ways. 

It is known that the upper bounds of differential and linear characteristic 
probabilities can, for several block ciphers, be estimated using the minimum 
numbers of differential and linear active s-boxes in some consecutive rounds, 
respectively. Kanda [11] shows the minimum numbers of differential and linear 
active s-boxes for Feistel ciphers with conservative SPN (S-P) round function. 

Definition 1. ([25]) The branch number B of a linear transformation P is defined
by

$B = \min_{x \neq 0} \left( w_H(x) + w_H(P(x)) \right),$

where $w_H(x)$ denotes the bytewise Hamming weight of x.

Theorem 1. The minimum number of differential/linear active s-boxes in any
eight consecutive rounds is equal to or larger than $2B + 1$.

Theorem 2. Let $p_s$ and $q_s$ be the maximum differential and linear probabilities
of all s-boxes, and $D$ and $L$ be the minimum numbers of total differential and
linear active s-boxes, respectively. Then, the maximum differential and linear
characteristic probabilities are bounded by $p_s^D$ and $q_s^L$, respectively.

In the case of Camellia, the maximum differential and linear probabilities of
the s-boxes are $p_s = q_s = 2^{-6}$. The branch number of the linear transformation
(P-function) is 5, i.e., $B = 5$. Letting p, q be the maximum differential and linear
characteristic probabilities of Camellia* reduced to 16 rounds, respectively, we
have $p \leq p_s^{22} = 2^{-132}$ and $q \leq q_s^{22} = 2^{-132}$ from
Theorems 1 and 2. Both probabilities are below the security threshold of 128-bit
block ciphers: $2^{-128}$. It follows that there is no effective differential characteristic
or linear characteristic for Camellia* reduced to more than 15 rounds. Since FL-
and FL$^{-1}$-functions are linear for any fixed key, they do not make the average
differential and linear probabilities of the cipher higher. Hence, it is proven that
Camellia offers enough security against differential and linear cryptanalyses.
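The arithmetic behind the 16-round bound can be retraced in a few lines. In the sketch below, `log2_bound` applies Theorems 1 and 2 with the stated values $B = 5$ and $p_s = 2^{-6}$. The helper `min_weight_over_01_inputs` is only a sanity check of our own: it evaluates the branch-number expression on inputs whose bytes are all 0 or 1, which can only give an upper bound on B, and confirms that weight 5 is actually reached.

```python
# Input bytes (1-based) XORed into each output byte of the P-function.
P_TAPS = [
    (1, 3, 4, 6, 7, 8), (1, 2, 4, 5, 7, 8), (1, 2, 3, 5, 6, 8),
    (2, 3, 4, 5, 6, 7), (1, 2, 6, 7, 8), (2, 3, 5, 7, 8),
    (3, 4, 5, 6, 8), (1, 4, 5, 6, 7),
]

B_STATED = 5           # branch number of the P-function, as stated in the text
S_BOX_LOG2_PROB = -6   # log2 of p_s = q_s


def min_weight_over_01_inputs():
    """Minimum of w_H(x) + w_H(P(x)) over non-zero inputs with bytes in {0, 1}."""
    best = 16
    for pattern in range(1, 256):
        x = [(pattern >> i) & 1 for i in range(8)]
        y = [sum(x[t - 1] for t in taps) % 2 for taps in P_TAPS]
        best = min(best, sum(x) + sum(y))
    return best


def log2_bound(rounds):
    """log2 of the characteristic-probability bound for a multiple of 8 rounds."""
    if rounds % 8:
        raise ValueError("Theorem 1 is applied here to blocks of eight rounds")
    active_sboxes = (rounds // 8) * (2 * B_STATED + 1)
    return S_BOX_LOG2_PROB * active_sboxes


if __name__ == "__main__":
    print(min_weight_over_01_inputs())  # restricted minimum, an upper bound on B
    print(log2_bound(8), log2_bound(16))
```

For 16 rounds this gives 22 active s-boxes and an exponent of -132, which is below the -128 threshold quoted above.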

Note that the results above are based on Theorems 1 and 2. Both theorems
deal with general cases of Feistel ciphers with an SPN round function, so we
expect that Camellia is actually more secure than shown by the results above.
As supporting evidence, we counted the number of active s-boxes of Camellia
and Camellia* with reduced rounds. The counting algorithm is similar to that
described in [21] except for the following three items.

— Prepare the table for the number of active s-boxes instead of transition 
probability table. 
Table 3. Upper bounds of differential characteristic probability of Camellia

# of rounds   Based on Th. 1 and 2   Camellia            Camellia*
     1        -                      1          (0)      1          (0)
     2        -                      2^{-6}     (1)      2^{-6}     (1)
     3        2^{-12}   (2)          2^{-12}    (2)      2^{-12}    (2)
     4        2^{-30}   (5)          2^{-42}    (7)      2^{-36}    (6)
     5        -                      2^{-54}    (9)      2^{-54}    (9)
     6        2^{-42}   (7)          2^{-66}    (11)     2^{-66}    (11)
     7        -                      2^{-72}    (12)     2^{-78}    (13)
     8        2^{-66}   (11)         2^{-72}    (12)     2^{-90}    (15)
     9        -                      2^{-78}    (13)     2^{-108}   (18)
    10        -                      2^{-108}   (18)     2^{-126}   (21)
    11        -                      2^{-120}   (20)     2^{-132}   (22)
    12        2^{-96}   (16)         2^{-132}   (22)     -

Note: The numbers in brackets are the number of active s-boxes.



Table 4. Upper bounds of linear characteristic probability of Camellia

# of rounds   Based on Th. 1 and 2   Camellia            Camellia*
     1        -                      1          (0)      1          (0)
     2        -                      2^{-6}     (1)      2^{-6}     (1)
     3        2^{-12}   (2)          2^{-12}    (2)      2^{-12}    (2)
     4        2^{-30}   (5)          2^{-36}    (6)      2^{-36}    (6)
     5        -                      2^{-54}    (9)      2^{-54}    (9)
     6        2^{-42}   (7)          2^{-66}    (11)     2^{-66}    (11)
     7        -                      2^{-72}    (12)     2^{-78}    (13)
     8        2^{-66}   (11)         2^{-72}    (12)     2^{-84}    (14)
     9        -                      2^{-78}    (13)     2^{-108}   (18)
    10        -                      2^{-102}   (17)     2^{-120}   (20)
    11        -                      2^{-120}   (20)     2^{-132}   (22)
    12        2^{-96}   (16)         2^{-132}   (22)     -

Note: The numbers in brackets are the number of active s-boxes.



— Count the number of active s-boxes instead of computing transition
probability.

— FL- and FL$^{-1}$-functions set all elements to the minimum number of
active s-boxes in the table. This means that the algorithm takes into account
the existence of weak subkeys inserted into the FL- and FL$^{-1}$-functions, since
there may be some possibility of connecting every later differential and
linear characteristic with the previous one with the highest probability, which
is equivalent to the minimum number of active s-boxes.

As a result, we confirmed that 12-round Camellia has no differential or
linear characteristic with probability higher than $2^{-128}$ (see Tables 3 and 4).

5.2 Truncated Differential and Linear Cryptanalysis 

Attacks using truncated differentials were introduced by Knudsen [13]. He
defined them as differentials where only a part of the difference can be predicted.
The notion of truncated differentials he introduced is wide, but with a
byte-oriented cipher it is natural to study bytewise differentials as truncated
differentials [23].

The maximum differential probability is considered to provide the strict
evaluation of security against differential cryptanalysis, but computing its value is
impossible in general, since a differential is a set of all differential characteristics
with the same input difference and the same output difference for a Markov
cipher [17]. On the other hand, a truncated differential can be regarded as a subset
of the differential characteristics which are exploitable in cryptanalysis. For some
ciphers, e.g., byte-oriented ciphers, the probability of a truncated differential can
be computed easily and correctly, and it gives a stricter evaluation than the
maximum differential characteristic probability.

A truncated differential cryptanalysis of reduced-round variants of E2 was 
presented by Matsui and Tokita at FSE’99 [23]. Their analysis was based on 
the “byte characteristic,” in which the difference in each byte is distinguished
only as non-zero or zero. They found a 7-round byte characteristic,
which leads to a possible attack on an 8-round variant of E2 without IT-Function 
(the initial transformation) and FT-Function (the final transformation). The 
best attack on E2 shown in [24] breaks an 8-round variant of E2 with either
IT-Function or FT-Function using $2^{94}$ chosen plaintexts. In [24] we also show an
attack which distinguishes a 7-round variant of E2 with IT- and FT-Functions
from a random permutation using $2^{91}$ chosen plaintexts.

Camellia is a byte-oriented cipher similar to E2, and it is important to
evaluate its security against truncated differential cryptanalysis. We searched for
truncated differentials using an algorithm similar to the one described in [23,24].
The main difference of the round function between E2 and Camellia is the
adoption of the 1-round SPN rather than the 2-round SPN, i.e., S-P-S. In the search for
truncated differentials of E2, we used about $2^{-8}$ as the probability of difference
cancellation in one byte at the XOR of the Feistel network. However, the round
function of Camellia does not have the second s-box layer, and difference
cancellation in several bytes sometimes occurs with the same probability.
Accordingly, we changed the difference cancellation rule at the XOR of the Feistel network
in the search algorithm. As a result, Camellia with more than 10 rounds is
indistinguishable from a random permutation, both with and without
FL/FL$^{-1}$-function layers.

Next, we introduce a new cryptanalysis called truncated linear cryptanalysis.
Due to the duality between differential and linear cryptanalyses, we can evaluate
security against truncated linear cryptanalysis by using an algorithm similar to
that above. To put it concretely, we can perform the search by replacing the
matrix of the P-function with the transposed matrix. As a result, Camellia* with
more than 10 rounds is indistinguishable from a random permutation.



5.3 Boomerang Attack 

The boomerang attack [26] requires two differentials. Let the probabilities of the
differentials be $p_\Delta$ and $p_\nabla$. A boomerang attack that is superior to exhaustive
key search requires

$p_\Delta \, p_\nabla \geq 2^{-64}.$   (1)

Using Table 3, there is no combination that satisfies (1) for Camellia*. The
best boomerang probability for Camellia* reduced to 8 rounds is bounded by $2^{-66}$,
which is obtained from $p_\Delta = 2^{-12}$ (3 rounds) and $p_\nabla = 2^{-54}$ (5 rounds). Since the
number of attackable rounds is much smaller than the number of rounds specified
for Camellia, 18 or 24, Camellia seems secure against boomerang attacks.
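The 8-round figure can be reproduced from the Camellia* entries of Table 3. The following sketch (our own helper names, not part of the original analysis) stores the exponents of the upper bounds per number of rounds and searches every way of splitting a given total into two parts.

```python
# -log2 of the differential upper bounds for Camellia* (Table 3),
# indexed by the number of rounds covered by one differential.
CAMELLIA_STAR_EXP = {
    1: 0, 2: 6, 3: 12, 4: 36, 5: 54, 6: 66,
    7: 78, 8: 90, 9: 108, 10: 126, 11: 132,
}

THRESHOLD_EXP = 64  # condition (1): p_delta * p_nabla >= 2^-64


def best_split(total_rounds, exps=CAMELLIA_STAR_EXP):
    """Return (exponent, r1, r2) with r1 + r2 == total_rounds and minimal exponent.

    The exponent e means that the product of the two bounds is 2^-e.
    """
    candidates = [
        (exps[r1] + exps[total_rounds - r1], r1, total_rounds - r1)
        for r1 in range(1, total_rounds)
        if r1 in exps and (total_rounds - r1) in exps
    ]
    return min(candidates)


if __name__ == "__main__":
    exponent, r1, r2 = best_split(8)
    print(f"8 rounds: best bound 2^-{exponent} from {r1} + {r2} rounds")
    print("satisfies (1):", exponent <= THRESHOLD_EXP)
```

For a total of 8 rounds the best split is 3 + 5 rounds with a product of $2^{-66}$, which does not reach the $2^{-64}$ required by (1).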
Table 5. Smallest number of unknown coefficients for 128-, 192-, and 256-bit keys



5.4 Higher Order Differential Attack 

Higher order differential attack is generally applicable to ciphers that can be
represented as Boolean polynomials of low degree. All intermediate bits in the
encryption process can be represented as Boolean polynomials, i.e., polynomials
in $GF(2)[x_1, x_2, \ldots, x_n]$ in the bits of the plaintext: $\{x_1, x_2, \ldots, x_n\}$. In the higher
order differential attack described in [10, Theorem 1], if the intermediate bits
are represented by Boolean polynomials of degree at most d, the (d + 1)-th order
differential of the Boolean polynomial becomes 0.

As for the degrees of the Boolean polynomials of the s-boxes of Camellia,
functions affine equivalent to the inversion function in $GF(2^8)$ are adopted as the
s-boxes. We confirmed that the degree of the Boolean polynomial of every
output bit of the s-boxes is 7 by finding the Boolean polynomial for every output bit
of the s-boxes. In Camellia, it is expected that the degree of an intermediate
bit in the encryption process increases as the data pass through many s-boxes.
For example, the degree becomes $7^3 > 128$ after passing through three s-boxes.
Therefore, we expect that higher order differential attacks fail against Camellia
with full rounds.
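The degree check described above can be imitated for the plain inversion map. As before, this is a sketch under assumptions: the field is built with the polynomial 0x11B and generator 3, which the text does not specify, and the affine layers of the real s-boxes are left out (they do not change the degree). The algebraic normal form of each output bit is obtained with a binary Moebius transform.

```python
def build_inverse_table(poly=0x11B):
    """Inversion table of GF(2^8) built from powers of the generator 3."""
    exp, log = [0] * 255, [0] * 256
    x = 1
    for i in range(255):
        exp[i] = x
        log[x] = i
        x ^= x << 1            # multiply by 3 = x + 1
        if x & 0x100:
            x ^= poly
    inv = [0] * 256            # 0 is mapped to 0
    for a in range(1, 256):
        inv[a] = exp[(255 - log[a]) % 255]
    return inv


def algebraic_degree(truth_table):
    """Degree of a Boolean function given by its truth table of length 2^n."""
    anf = list(truth_table)
    n = len(anf).bit_length() - 1
    for i in range(n):         # in-place binary Moebius transform
        step = 1 << i
        for x in range(len(anf)):
            if x & step:
                anf[x] ^= anf[x ^ step]
    return max((bin(m).count("1") for m, c in enumerate(anf) if c), default=0)


INV = build_inverse_table()
DEGREES = [
    algebraic_degree([(INV[x] >> bit) & 1 for x in range(256)])
    for bit in range(8)
]

if __name__ == "__main__":
    print(DEGREES)             # degree of each of the 8 output bits
    print(7 ** 3, 7 ** 3 > 128)
```

Every output bit has degree 7, so three s-box layers already allow a formal degree of $7^3 = 343$, above the 128 plaintext bits.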

5.5 Interpolation Attack and Linear Sum Attack 

The interpolation attack proposed in [10] is typically applicable to attacking 
ciphers that use simple algebraic functions. Linear sum attack [2] is a general- 
ization of the interpolation attack. 

A practical algorithm that evaluates the security against linear sum attack
was proposed in [2]. We searched for linear relations between any plaintext byte
and any ciphertext byte over $GF(2^8)$ using the algorithm. Table 5 summarizes the
results, and shows that Camellia is secure against linear sum attack including
interpolation attack. By [2, Theorem 3], it also implies that Camellia is secure
against the Square attack [9].

5.6 Security of Key Schedule 

No Equivalent Keys: Since the set of subkeys generated by the key schedule
contains the original secret key, there is no equivalent set of subkeys generated
from distinct secret keys. Therefore, we expect that there are no distinct secret
keys both of which encrypt each of many plaintexts into the same ciphertext.

Slide Attack: In [6,7] the slide attacks were introduced, based on earlier work
in [5,12]. In particular it was shown that iterated ciphers with identical round
functions, that is, equal structures and equal subkeys in the round functions, are
susceptible to slide attacks.

In Camellia, FL- and FL$^{-1}$-functions are “inserted” between every 6 rounds
of a Feistel network to provide non-regularity across rounds. Moreover, from the
viewpoint of the key schedule, slide attacks seem to be very unlikely to succeed.

Related-Key Attack: We are convinced that the key schedule of Camellia makes
related-key attacks [5,15] very difficult. In these attacks, an attacker must be
able to get encryptions using several related keys. However, the subkeys
depend on $K_A$ and $K_B$, which are the results of encryption of a secret key. If
an attacker changes the secret key, he cannot obtain the desired $K_A$ and $K_B$,
and vice versa, so these subkey relations will be very hard to control and predict.

6 Performance 

6.1 Software Implementations 

Table 6 summarizes the current software implementations of Camellia. The table
shows that Camellia can be efficiently implemented on low-end smart cards, and
32-bit and 64-bit processors. We use the abbreviations M (mega) for $10^6$ and m
(milli) for $10^{-3}$ in the table.
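The bracketed rates in Table 6 follow from the cycle counts and the clock frequencies given in the table notes. The sketch below shows the conversion we assume was used (one 128-bit block per encryption call); the function names are ours.

```python
BLOCK_BITS = 128  # Camellia block size


def keys_per_second(clock_hz, setup_cycles):
    """Key setups per second for a given clock rate and setup cost."""
    return clock_hz / setup_cycles


def bits_per_second(clock_hz, enc_cycles, block_bits=BLOCK_BITS):
    """Encryption throughput in bits per second."""
    return block_bits * clock_hz / enc_cycles


def seconds(cycle_hz, cycles):
    """Elapsed time when the cycle rate is given directly (used for the 8051)."""
    return cycles / cycle_hz


if __name__ == "__main__":
    # Pentium III at 700 MHz, 128-bit key: 160 setup cycles, 371 cycles per block.
    print(keys_per_second(700e6, 160) / 1e6)   # about 4.4 M keys/s
    print(bits_per_second(700e6, 371) / 1e6)   # about 242 Mb/s
    # 8051 at 12 MHz with 12 oscillator periods per cycle: 1 M cycles/s.
    print(seconds(12e6 / 12, 10217) * 1e3)     # about 10 ms per block
```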

6.2 Hardware Performance 

We measured the hardware performance of Camellia for 128-bit keys on ASIC 
(Application Specific Integrated Circuit) and FPGA (Field Programmable Gate 
Array). Table 7 shows the environment of our hardware design and evaluation. 
We evaluated hardware performance of the three types: Type 1, Type 2 and 
Type 3 logic. The hardware design policy of each type is as follows. 

Type 1 Fast implementation from the viewpoint of encryption speed 
Type 2 Small implementation from the viewpoint of total logic size 
Type 3 Small implementation (special case for FPGA) 

Tables 8 through 11 summarize the hardware performance of Camellia for 
128-bit keys on ASIC and FPGA. 

7 Conclusion 

We have presented Camellia, the rationale behind its design, its suitability for
both software and hardware implementation, and the results of our cryptanalyses.
For further information, please refer to the specification of Camellia [1] or the full
paper, which are available on the Camellia home page:
http://info.isl.ntt.co.jp/camellia/.

The performances shown in this paper leave room for further optimizations. 
The latest performance results will be posted on the Camellia home page. 
Table 6. Camellia software performance

                      Key     Timing [cycles]               Dynamic [bytes]    Code [bytes]       Table
Processor  Lang.   [bits]   Setup^a (^b)   Enc.^c (^d)     Setup^a   Enc.^c   Setup^a   Enc.^c   [bytes]
P III^e    Asm       128    160 (4.4M)     371 (242M)         28       36      1,046     2,150     8,224
                     192    222 (3.2M)     494 (181M)         28       36      1,469     3,323     8,240
                     256    226 (3.1M)     494 (181M)         28       36      1,485     3,323     8,240
P II^f     C^g       128    263 (1.1M)     577 (67M)          44       64      1,600     3,733     4,128
Alpha^h    Asm       128    118 (5.7M)     339 (252M)         48       48      1,132     3,076    16,528
                     192    176 (3.7M)     445 (192M)         48       48      1,668     4,000    16,528
                     256    176 (3.7M)     445 (192M)         48       48      1,676     4,000    16,528
                     128    158 (4.2M)     326 (262M)         48       48      1,600     2,928    16,512
8051^i     Asm       128      0 (0)      10217 (10m)           0       32          0       702       288

a  Key schedule may be included.
b  Seconds for 8051, and keys/s for other processors.
c  The numbers in this column are the same as for decryption.
d  Seconds for 8051, and b/s for other processors.
e  Intel Pentium III (700MHz), 256KB on-die L2 cache, FreeBSD 4.0R, 128MB main memory.
f  Intel Pentium II (300MHz), 512KB L2 cache, MS-Windows 95, 160MB main memory.
g  ANSI C, Microsoft Visual C++ 6 with the optimization options /G6 /Zp16 /ML /Ox /Ob2.
h  Alpha 21264 (667MHz), Compaq Tru64 UNIX 4.0F, 2GB main memory.
i  Intel 8051 (12MHz; 1 cycle = 12 oscillator periods) simulator on Unix.



Table 7. Hardware evaluation environment (ASIC, FPGA)

Language          (ASIC, FPGA) Verilog-HDL
Simulator         (ASIC, FPGA) Verilog-XL
Design library    (ASIC) Mitsubishi Electric 0.35 µm CMOS ASIC library
                  (FPGA) Xilinx XC4000XL series
Logic synthesis   (ASIC) Design Compiler version 1998.08
                  (FPGA) Synplify version 5.3.1 and ALLIANCE version 2.1i



Table 8. Hardware performance (Type 1: [ASIC (0.35 µm CMOS)])

                           Area [Gate]                      Key setup    Critical-    Throughput
Algorithm      Enc.&Dec.^a   Key expan.^b   Total logic^c   time [ns]    path [ns]    [Mb/s]
DES               42,204        12,201          54,405          -          55.11       1161.31
Triple-DES       124,888        23,207         128,147          -         157.09        407.40
MARS             690,654     2,245,096       2,935,754       1740.99      567.49        225.55
RC6              741,641       901,382       1,643,037       2112.26      627.57        203.96
Rijndael         518,508        93,708         612,834         57.39       65.64       1950.03
Serpent          298,533       205,096         503,770        114.07      137.40        931.58
Twofish          200,165       231,682         431,857         16.38      324.80        394.08
Camellia         216,911        55,907         272,819         24.36      109.35       1170.55

a  including output registers
b  including subkey registers
c  including buffers for fan-out adjustment
Table 9. Hardware performance (Type 2: [ASIC (0.35 µm CMOS)])

                           Area [Gate]                      Key setup    Critical-    Throughput
Algorithm      Enc.&Dec.^a   Key sched.^b   Total logic^c   time [ns]    path [ns]    [Mb/s]
Camellia           6,367         4,979          11,350        110.2        27.67        220.28

a  including output registers and data selector
b  including subkey registers and a part of key expansion logic
c  including buffers for fan-out adjustment



Table 10. Hardware performance (Type 2: [FPGA (XC4000XL series)])

Algorithm   Total area [CLBs]   Critical-path [ns]   Throughput [Mb/s]
Camellia    1,296               78.815               77.34



Table 11. Hardware performance (Type 3: [FPGA (XC4000XL series)])

Algorithm   Total area [CLBs]   Critical-path [ns]   Throughput [Mb/s]
Camellia    874                 49.957               122.01



We have analyzed Camellia and found no important weakness. The cipher has 
a conservative design and any practical attacks against Camellia would require 
a major breakthrough in the area of cryptanalysis. We think that Camellia is a 
very strong cipher, which matches the security of the existing best block ciphers. 
Abstract. The development process of the Advanced Encryption Stan-
dard (AES) was launched in 1997 by the US government through NIST. 
The Decorrelated Fast Cipher (DFC) was the CNRS proposal for the 
AES, among 14 other candidates in 1998. It was based on the recent 
decorrelation theory, to obtain certain security proofs covering linear and 
differential cryptanalysis. DFC received numerous comments. In particu- 
lar, Coppersmith discovered a weakness in the key schedule. We address 
this weakness by a slight modification on DFC. This paper presents the 
specifications and rationales of DFC version 2, and discusses issues raised 
during the AES process. 



1 Introduction 

A major goal in cryptography is to prove security statements on encryption 
schemes. In this respect, it is well-known that the status of secret-key cryptog-
raphy is quite different from that of public-key cryptography. The decorrelation 
theory was introduced in 1998 (see [20] for the original reference) as an attempt 
towards filling this gap, by providing new ideas to build block ciphers, together 
with security proofs covering certain (however general) classes of attacks. Since 
the AES process was launched by NIST at about the same period, the French 
National Center for Scientific Research (CNRS) decided to start a project aimed 
at showing that decorrelation theory was a reasonable proposal for making se- 
cure and efficient block ciphers. The target platform was chosen to be 64-bit 
microprocessors, as such chips are likely to become standard during the lifetime 
of the AES. The CNRS project gave birth to the “Decorrelated Fast Cipher” 
(DFC) [6,7]. 

Decorrelation theory (see [20,21,22,23,24,25]) makes it possible to prove formal results
on the security of cryptographic primitives under certain hypotheses which we
believe to be realistic. In particular, it makes it possible to quantify the best advantage
to distinguish two families of block ciphers, for a class of attacks with limited
resources. For instance, one can consider any Turing machine restricted to a
given number d of oracle calls to the block cipher. Most of the existing block
ciphers are provably secure for the d = 1 case. However, none addresses the
d = 2 case, except DFC and other decorrelation theory-based ones. Interestingly,
the d = 2 case already provides formal security against possible formalizations
of differential and linear cryptanalysis. The Nyberg-Knudsen approach [16]
was the only previously known way to achieve similar security statements (with
MISTY [13,14] as a famous example). The MISTY approach however does not
provide much design flexibility, and the DFC approach seems to achieve stronger
results as shown in Section 4. Besides, the Nyberg-Knudsen approach is indeed
an ad hoc construction for providing security against differential and linear
attacks but does not consider other general attacks with d = 2.

Implementing decorrelated block ciphers with order d = 2 by using known 
techniques (like the PEANUT construction [20]) requires the use of built-in 
multiplication which leads to non-trivial optimization tricks. DFC was submitted 
to the AES in order to show that such challenges could be overcome. DFC 
attracted many comments from the AES community, sometimes controversial. 
For instance, it was claimed that DFC was too slow, that its security paradigm 
brought nothing new, and that the security margin was too small. In addition,
Coppersmith discovered a weakness in the key schedule by showing the existence
of a fraction of $2^{-128}$ of weak keys (using a quite complex algorithm).

In this paper, we give the complete specifications of DFCv2. This new version 
addresses the key schedule problem and allows scalable modifications of the 
internal structure (so that the user can choose any “security margin”). We also 
try to respond to the issues raised on the original DFC. 

2 Specifications of DFCv2 

In this section, we give the complete specifications of DFCv2, and emphasize 
rationales in each subsection. A sample test vector for the nominal choices of 
the parameters is given in the Appendix.

2.1 Notation 

All quantities are bit strings or integers. When string lengths are divisible by
four, quantities are denoted in hexadecimal. For instance, d43$_x$ denotes the
bitstring 110101000011 and also represents the (decimal) integer 3395 in arithmetic
operations. We use classical bitwise bitstring operations: OR, AND, NOT, XOR.
We also use the following arithmetic operations over the integers: +, ×, mod. The
result of an arithmetic operation is implicitly converted into a bitstring whose
length will be clear from the context. Finally, we use the bitstring concatenation
| and the trunc$_n$ function that extracts the n leftmost bits of a bitstring.
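These conventions are easy to mirror in code. The sketch below represents bitstrings as Python strings of '0' and '1', which is our choice for readability and not something the specification prescribes; the helper names are likewise hypothetical.

```python
def hex_to_bits(h):
    """Convert a hexadecimal string to a bitstring, 4 bits per digit."""
    return "".join(format(int(c, 16), "04b") for c in h)


def bits_to_int(bits):
    """Interpret a bitstring as a non-negative integer (leftmost bit is most significant)."""
    return int(bits, 2)


def int_to_bits(value, length):
    """Convert an integer back to a bitstring of the given length."""
    if value < 0 or value >> length:
        raise ValueError("value does not fit in the requested length")
    return format(value, "0{}b".format(length))


def trunc(n, bits):
    """trunc_n: the n leftmost bits of a bitstring."""
    if n > len(bits):
        raise ValueError("cannot extract more bits than the string holds")
    return bits[:n]


def concat(a, b):
    """Bitstring concatenation a|b."""
    return a + b


if __name__ == "__main__":
    d43 = hex_to_bits("d43")
    print(d43, bits_to_int(d43))   # the example from the text
    print(trunc(6, d43))           # six leftmost bits
```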

2.2 High Level Overview 

DFCv2 is characterized by four parameters m, k, r and s chosen for security and
efficiency reasons. In DFCv2(m, k, r, s), m is the message block length, k is the
key length, r is the number of encryption rounds, and s is the number of rounds
for the subkey generation. We require that $m \geq 32$, $0 \leq k \leq 2m$, $rs \leq 128$, m
is a multiple of 4, and r is even. The nominal choice for DFCv2 is m = 128,
$k \in \{128, 192, 256\}$, r = 8 and s = 4.

The encryption function DFCv2$_K$ operates on m-bit message blocks by means of
a secret key K of arbitrary length k up to 2m bits. The corresponding decryption
function is DFCv2$_K^{-1}$ and operates on m-bit message blocks.

The secret key K is first turned into an mr-bit “Expanded Key” EK through
an “Expanding Function” EF, i.e. EK = EF(K). As explained in Section 2.5, the
EF function applies r s-round Feistel schemes (see Feistel [5]). The encryption
process itself performs a similar r-round Feistel scheme. Each round uses the
“Round Function” RF. This function maps an $\frac{m}{2}$-bit string onto an $\frac{m}{2}$-bit string
by using one m-bit string parameter. It is defined in Section 2.3.

Given a bitstring $\sigma$ of length multiple of m, say $m\rho$, we split it into $\rho$ m-bit
strings

$\sigma = p_1 | p_2 | \cdots | p_\rho.$

From $\sigma$ we define a permutation Enc$_\sigma$ on the set of m-bit strings coming from
a $\rho$-round Feistel scheme. Any m-bit string PT is split into two $\frac{m}{2}$-bit
halves $x_0$ and $x_1$ so that PT = $x_0 | x_1$. We build a sequence $x_0, \ldots, x_{\rho+1}$ by the
equation

$x_{i+1} = RF_{p_i}(x_i) \text{ XOR } x_{i-1} \quad (i = 1, \ldots, \rho)$   (1)

and we define Enc$_\sigma$(PT) = $x_{\rho+1} | x_\rho$.

Given an m-bit plaintext block PT and the mr-bit expanded key EK, the
DFCv2$_K$ encryption function is obtained as

$DFCv2_K = Enc_{EK}$   (2)

(that is, an r-round Feistel cipher).

The EF function uses an s-round version defined with Enc. 

If we split EK into r m-bit strings

$EK = RK_1 | RK_2 | \cdots | RK_r$   (3)

obviously, we have DFCv2$_K^{-1}$ = Enc$_{revEK}$ where

$revEK = RK_r | RK_{r-1} | \cdots | RK_1.$   (4)


2.3 The RF Function 

The RF function (as for “Round Function”) is fed with one m-bit parameter,
which we view as two $\frac{m}{2}$-bit parameters: an “a-parameter” and a “b-parameter”.
It processes an $\frac{m}{2}$-bit input x and outputs an $\frac{m}{2}$-bit string defined as follows:

$RF_{a|b}(x) = CP\left(((a \times x + b) \bmod p) \bmod 2^{m/2}\right)$   (5)

where CP is a permutation over the set of all $\frac{m}{2}$-bit strings (which appears in
Section 2.4) and p is the smallest prime integer greater than $2^{m/2}$. For instance,
if m = 128, we use $p = 2^{64} + 13$. See the following table for other values.

  m      p
  32     2^{16} + 1
  64     2^{32} + 15
  96     2^{48} + 21
  128    2^{64} + 13

Following the PEANUT scheme paradigm (see [20]), the RF function
implements a decorrelation module. RF is basically made from a classical round function
(with CP), and from the pairwise decorrelation module
$x \mapsto ((ax + b) \bmod p) \bmod 2^{m/2}$ which was used in the PEANUT construction.
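The table of primes and the decorrelation module can be checked with a short sketch. The primality test below is a standard Miller-Rabin test with the first twelve primes as bases, which is known to be deterministic for numbers of the size involved here; the function names are ours, and the CP permutation is deliberately left out because its constants are only fixed later in the specification.

```python
def is_prime(n):
    """Miller-Rabin with bases 2..37 (deterministic for n below about 3.3e24)."""
    if n < 2:
        return False
    small = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for q in small:
        if n % q == 0:
            return n == q
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in small:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True


def prime_offset(m):
    """Smallest k > 0 such that 2^(m/2) + k is prime."""
    base = 1 << (m // 2)
    k = 1
    while not is_prime(base + k):
        k += 1
    return k


def decorrelation_module(a, b, x, m=128):
    """x -> ((a*x + b) mod p) mod 2^(m/2), without the CP permutation."""
    half = m // 2
    p = (1 << half) + prime_offset(m)
    return ((a * x + b) % p) % (1 << half)


if __name__ == "__main__":
    for m in (32, 64, 96, 128):
        print(m, "p = 2^{} + {}".format(m // 2, prime_offset(m)))
    print(hex(decorrelation_module(3, 5, (1 << 64) - 1)))
```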

From this construction, Decorrelation Theory ensures that if we consider
DFCv2(128, k, 6, s) and if we make the heuristic assumption that EK is random
and uniformly distributed from the random choice of the secret key, then the best
advantage for distinguishing this reduced and idealized version of DFCv2 from
a truly random permutation when limited to two chosen plaintexts is less than
2“4i'' (see [24]). This property has several consequences on the formal security
of DFCv2 as summarized in Section 4.

2.4 The CP Permutation 

The CP permutation (as for “Confusion Permutation”) uses a look-up table RT
(as for “Round Table”) which takes a 6-bit integer as input and provides an $\frac{m}{4}$-bit
string output. Its size is thus 2m bytes.

Let $y = y_l | y_r$ be the input of CP where $y_l$ and $y_r$ are two $\frac{m}{4}$-bit strings. We
define

$CP(y) = \left( (y_r \text{ XOR } (RT \circ trunc_6)(y_l)) \,|\, (y_l \text{ XOR } KC) \right) + KD \bmod 2^{m/2}$   (6)

where KC is an $\frac{m}{4}$-bit constant string, and KD is an $\frac{m}{2}$-bit constant string. The
permutation CP is depicted in Fig. 1.

The constants RT(0), ..., RT(63), KC and KD will be set in Section 2.6.

The purpose of CP is to implement a permutation over all $\frac{m}{2}$-bit strings which
breaks the algebraic structure of the decorrelation module. For this we use a
mixture of XORs and additions in a way very similar to that of the RC5 block
cipher [19].

The RT tables play an important role by introducing randomness. These tables 
are limited to 2m bytes in total (in order to fit to embedded hardware with low 
memory) but with a maximal input size. 

2.5 Key Scheduling Algorithm 

In order to generate a sequence $RK_1, RK_2, \ldots, RK_r$ from a given key K
represented as a bit string of length at most 2m, we use the following algorithm.
We first pad K with a constant pattern KS in order to make a 2m-bit “Padded Key”
string by

$$PK = trunc_{2m}(K | KS). \qquad (7)$$

If K is of length m, we can observe that only the first m bits of KS are used.
We define KS of length 2m in order to allow any key size from 0 to 2m.

Then we split PK into two m-bit strings $RK_0$ and $IRK_0$ (as for “Internal
Round Key”) such that $PK = IRK_0 | RK_0$. We assume we are given 16 m-bit
constants $KAB_0, \ldots, KAB_{15}$. We now define



$$IRK_{j+1} = IRK_j \text{ XOR } \begin{cases} KAB_{RT(j) \bmod 16} & \text{if } j < 64 \\ KAB_{(RT(j-64) \gg 8) \bmod 16} & \text{otherwise} \end{cases} \qquad (8)$$

for $j = 0, 1, \ldots, rs - 1$ where $RT(j - 64) \gg 8$ denotes the bitstring
$RT(j - 64)$ logically shifted by 8 bits to the right. Basically, we take the four
least significant bits of RT(j) for j < 64 and some other four bits of RT(j − 64)
for 64 ≤ j < 128. (Since we require that rs ≤ 128, j is less than 128.) We notice
that $IRK_j$ is actually the XOR of $IRK_0$ with some constant depending on j.
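
Equation (8) can be sketched as below. The function name is hypothetical and
the toy RT and KAB values are placeholders chosen only so that the code runs;
they are not the constants of Section 2.6.

```python
def irk_sequence(irk0, rt, kab, count):
    """Return [IRK_0, IRK_1, ..., IRK_count] following Equation (8)."""
    irk = [irk0]
    for j in range(count):
        if j < 64:
            index = rt[j] % 16                 # four least significant bits of RT(j)
        else:
            index = (rt[j - 64] >> 8) % 16     # four other bits of RT(j - 64)
        irk.append(irk[-1] ^ kab[index])
    return irk

# Placeholder constants (illustration only): 32-bit RT entries, 128-bit KAB entries.
TOY_RT = [(i * 0x9E3779B1 + 7) & 0xFFFFFFFF for i in range(64)]
TOY_KAB = [((i + 1) * 0x0123456789ABCDEF0FEDCBA987654321) % (1 << 128) for i in range(16)]
```

Since every step only XORs a constant, the offsets $IRK_j$ XOR $IRK_0$ do not
depend on the key, which is the observation made at the end of the paragraph above.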

Each sequence of s $IRK_j$ values defines an sm-bit string $IEK_i$ which serves
as the round key sequence of some s-round internal encryption function. More
precisely, we define

$$IEK_i = IRK_{is-s+1} | \ldots | IRK_{is-1} | IRK_{is} \qquad (9)$$

for i = 1, 2, . . . , r and

$$IEnc_i = Enc_{IEK_i} \qquad (10)$$

for i = 1, 2, . . . , r as an “Internal Encryption”. We now define the $RK_i$
sequence by

$$RK_i = IEnc_i(RK_{i-1}) \qquad (11)$$

for i = 1, 2, . . . , r. Finally we define

$$EK = EF(K) = RK_1 | RK_2 | \ldots | RK_r. \qquad (12)$$

^ The following $IRK_i$ sequence replaces the $OAP_i|OBP_i$ and $EAP_i|EBP_i$
sequences defined in DFCv1.

^ These constants replace the $KA_i$ and $KB_i$ sequences defined in DFCv1.



We can start the same process from IRKj.|RKj. instead of PK. This enables 
to decrypt by computing the reversed sequence RK^ “on the fly” . 

This new key schedule repairs two drawbacks which were reported on DFCv1
(see [2]). Namely, due to the pairwise difference of the $IRK_j$s, the iterations
of the $IEK_i$s are no longer symmetric, which fixes the weak key property
reported by Coppersmith, and the first round key $RK_1$ now depends on all key
bits. In addition, the $RK_i$ sequence now looks “more random”.

2.6 On the Definition of the Constants 

The previously defined algorithm depends on several constants: 

— 64 constants RT(0), . . . , RT(63) of $m/4$ bits (thus forming 16m bits),

— one $m/2$-bit constant KD,

— one $m/4$-bit constant KC,

— 16 m-bit constants $KAB_0, \ldots, KAB_{15}$,

— one 2m-bit constant KS.

Those constants must satisfy the following security criteria.

1. the RT round table has no collision,

2. KD is odd,

3. the $IRK_j$ are pairwise different for $j = 1, \ldots, rs$.

We will use some constants several times. Actually, the RT table, KC and KD 
will contain the other constants. We thus need 18m bits of random constants. 

In order to convince the reader that this design hides no trap-door, we choose
the constants from the hexadecimal expansion of the mathematical constant e

$$e = \sum_{n=0}^{\infty} \frac{1}{n!} = 2.\mathrm{b7e15162}\ldots_x \qquad (13)$$

We use the following scheme in order to define the constants.

Step 1. Let EES (as for “e Expansion String”) be the first 18m bits of the
expansion of e after the (hexa)decimal point. We define

$$trunc_{67m/4}(EES) = RT(0)|RT(1)| \ldots |RT(63)|KD|KC. \qquad (14)$$

^ Note that when the third criterion is satisfied for one key, it is satisfied
for any key.
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Here is the EES string for m = 128. 

b7e15162 8aed2a6a bf715880 9cf4f3c7 62e7160f 38b4da56x
a784d904 5190cfef 324e7738 926cfbe5 f4bf8d8d 8c31d763x
da06c80a bb1185eb 4f7c7b57 57f59584 90cfd47d 7c19bb42x
158d9554 f7b46bce d55c4d79 fd5f24d6 613c31c3 839a2ddfx 
8a9a276b cfbfalcS 77c56284 dab79cd4 c2b3293d 20e9e5eax 
f02ac60a cc93ed87 4422a52e cb238fee ebabSadd 835fdla0x 
753d0a8f 78e537d2 b95bb79d 8dcaec64 2c1e9f23 b829b5c2x
780bf387 37df8bb3 00d01334 a0d0bd86 45cbfa73 a6160ffex 
393c48cb bbca060f 0ff8ec6d 31beb5cc eed7f2f0 bb088017x 
163bc60d f45a0ecb lbcd289b OScbbfea 21ad08el 847f3f73x 
78d56ced 94640d6e f0d3d37b e67008e1 86d1bf27 5b9b241dx
eb64749a 47dfdfb9 6632c3eb 061b6472 bbf84c26 144e49c2x 

Step 2. We use the following algorithm to enforce the first two security criteria.

1. for i = 0 to 63 do

(a) while there exists 0 ≤ j < i such that RT(j) = RT(i), replace RT(i) by
RT(i) + 1 mod $2^{m/4}$.

2. if KD is even, replace KD by KD + 1.

3. change the EES string accordingly so that Equation (14) holds.
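
The first two items of Step 2 can be sketched as follows. The function name is
hypothetical, and writing the adjusted values back into EES (item 3) is left out.

```python
def enforce_criteria(rt, kd, m=128):
    """Remove collisions from the RT table and force KD to be odd (Step 2)."""
    modulus = 1 << (m // 4)
    rt = list(rt)
    for i in range(len(rt)):
        # Increment RT(i) until it differs from every earlier entry.
        while rt[i] in rt[:i]:
            rt[i] = (rt[i] + 1) % modulus
    if kd % 2 == 0:
        kd += 1
    return rt, kd
```

For instance, the toy table [5, 5, 6, 5] with KD = 10 becomes [5, 6, 7, 8] with
KD = 11.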

Step 3. From this EES string we now define

$$EES = KAB_0 | \ldots | KAB_{15} | KS. \qquad (15)$$

Note that the third security criterion is necessarily satisfied, otherwise we would
have collisions in RT.
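
As an illustration of Equations (14) and (15), the sketch below slices an EES
string into its constants for m = 32, which needs only the first 18m = 576 bits
(the first eighteen 32-bit words printed above). The helper `field` is
hypothetical, bits are assumed to be counted from the left, and the values shown
are those before any Step 2 adjustment.

```python
EES_HEX = (
    "b7e15162" "8aed2a6a" "bf715880" "9cf4f3c7" "62e7160f" "38b4da56"
    "a784d904" "5190cfef" "324e7738" "926cfbe5" "f4bf8d8d" "8c31d763"
    "da06c80a" "bb1185eb" "4f7c7b57" "57f59584" "90cfd47d" "7c19bb42"
)
M = 32
TOTAL = 18 * M
EES = int(EES_HEX, 16)

def field(start, length):
    """Bits start .. start+length-1 of EES, counted from the left (assumed order)."""
    return (EES >> (TOTAL - start - length)) & ((1 << length) - 1)

QUARTER = M // 4
RT = [field(QUARTER * i, QUARTER) for i in range(64)]   # Equation (14)
KD = field(16 * M, M // 2)
KC = field(16 * M + M // 2, QUARTER)
KAB = [field(M * i, M) for i in range(16)]              # Equation (15)
KS = field(16 * M, 2 * M)
```

The two equations read the same bits twice, which is how the RT table, KD and KC
end up sharing material with the KAB constants and KS.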

At the end of the algorithm, we obtain a constant EES string which depends on
the parameters, comes from the expansion of e, and contains all the defined
constants. We notice that for m = 128 all criteria are satisfied when EES is
equal to the original expansion string of e (written in hexadecimal as above).
For large m, it is highly unusual that we have to change it (but for KD with
probability 1/2).



3 Benchmarks and Implementations 

Straightforward implementations of DFC are quite slow on 32-bit microprocessors
for the nominal choices of parameters, due to the critical operation
$(ax + b) \bmod (2^{64} + 13)$. Efficient implementations require non-trivial
tricks. That is why the original implementation of DFCv1, which was bound to
NIST's requirements (namely, an ANSI-C implementation, which restricts to 32-bit
words and prohibits the use of the 32-bit × 32-bit → 64-bit multiplication of
most processors), was quite slow and actually slower than most other candidates,
especially since it dealt with endianness as well. The ANSI-C implementation
required 3600 clock cycles per encryption (without key setup) on a Pentium Pro.
This should be compared with the 392 clock cycles on the same processor using
assembly language and processor-specific tricks. Further implementation tricks
(which were summarized by Noilhan [15]) and clever use of specific architectures
of microprocessors have shown that DFC was among the fastest AES candidates,
and notably the fastest one on ALPHA 64-bit microprocessors (310 clock cycles
per encryption without the key setup, on an ALPHA 21164a in assembly code^).

DFCv2 does not introduce important implementation differences from
DFCv1 for the nominal choice of the parameters. More precisely, only the key
schedule has changed, and even the complexity of the key setup has not changed
(it roughly takes four basic encryptions).
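
One generic way to avoid a full division in the critical operation is to use
$2^{64} \equiv -13 \pmod{2^{64} + 13}$ and fold the high half of the product
into the low half. The sketch below illustrates that idea only; it is not the
implementation of [8] or [15], and the function names are hypothetical.

```python
P = (1 << 64) + 13            # modulus for m = 128
MASK64 = (1 << 64) - 1

def fold(t):
    """Map t = hi*2^64 + lo to lo - 13*hi, which has the same residue modulo P."""
    return (t & MASK64) - 13 * (t >> 64)

def reduce_mod_p(t):
    """Reduce 0 <= t < 2^128 modulo P with two folds and one correction."""
    t = fold(fold(t))
    if t < 0:
        t += P
    elif t >= P:
        t -= P
    return t

def decorrelate_128(x, a, b):
    """((a*x + b) mod P) mod 2^64 for 64-bit x, a, b."""
    return reduce_mod_p(a * x + b) & MASK64
```

After the first fold the value may be negative; the second fold relies on
Python's floor shift and non-negative masking, and brings it into a range where
at most one subtraction of P is needed.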



4 Security Analysis 

4.1 Provable Security Results 

We state the security results in terms of the new parameters (m, k, r, s).

Ideal key schedule. We recall that the security results consist, firstly, of
theoretical results for an ideal extension of DFCv2 in which the $RK_i$ sequence
is assumed to be uniformly distributed (we will call DFCv2*(m, r) this ideal
algorithm, which does not depend on k or s), and secondly, of some practical
results on the real DFCv2 algorithm in which we have to make a heuristic
assumption stated below.

Theorem 1 ([24]). The best advantage of an attack limited to two adaptively
chosen plaintexts for distinguishing DFCv2*(m, r) from a uniformly distributed
random permutation is bounded by



Bes^tAdv(DFCv2*(m,r),C*) < ^3 ((2^)' " l) + 2¥) 

where p is the smallest prime number greater than $2^{m/2}$.

If we let $p = 2^{m/2}(1 + \delta)$, the previous upper bound can be approximated by

i (6(5 + 23-f))LSJ . (17) 

This shows that the best advantage is negligible against $2^{-m}$ if r ≥ 9 when
the attack is limited to two chosen plaintexts (i.e. in the d = 2 case). For
m = 128, we have $\delta = 13 \cdot 2^{-64}$ and we get back the bound of DFCv1

BestAdv(DFCv2*(128,r),C*) < (18) 

Cla 2 , 

^ Implementation due to Robert Harley, see [8]. See also [1,15].
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From the decorrelation theory we know that the security against any attack 
limited to two chosen plaintexts implies the security against some reasonable
formalization of differential and linear cryptanalysis (see [20]). Namely, the
average complexity of differential cryptanalysis (over the distribution of the
keys) needs at least to be within the order of 1/(4 BestAdv), as for linear
cryptanalysis (from an asymptotic bound). In this context, for instance,
differential cryptanalysis can be formalized into:

1. pick a differential characteristic (a, b)

2. query an input pair of difference a until the corresponding output pair has
a difference of b

It is well-known that this formalization is the core of regular differential crypt- 
analysis [3]. For instance, 2R attacks apply such a procedure on r — 2 rounds. 
Since we can claim that the differential cryptanalysis core against DFCv2*(128, 
6) has a complexity of 2^^®, we can thus claim that DFCv2*(128, 8) is secure 
against a 2R differential cryptanalysis up to a complexity of 2^^^. 

Similarly, the average complexity of any known plaintext coming from an
iterated attack of order one (i.e. an iterated attack in which each iteration
extracts one bit of information from one known plaintext/ciphertext pair) needs
to be at least within the order of $1/(2\sqrt{BestAdv})$ (see [22]).

More precisely, we recall the following result: 

Theorem 2 ([20,22]). For any differential distinguisher of complexity n
against DFCv2*(m, r), the advantage $Adv_D$ is such that

$$Adv_D \le n\,BestAdv + \frac{n}{2^m - 1} \qquad (19)$$

where BestAdv is bounded by Equation (16). Similarly, for any linear
distinguisher we have

$$\lim_{n \to +\infty} \frac{Adv_L}{n^{1/3}} \le 9.3 \left( 4\,BestAdv + \frac{1}{2^m - 1} \right)^{1/3}. \qquad (20)$$

For any known plaintext iterated distinguisher of order 1 we have 

Adv/ < 3 -I- 3BestAdv^ -|- nBestAdv. (21) 

Real key schedule. Since DFCv2 has a new key scheduling algorithm, we need
to transfer the security results on DFCv2* to DFCv2. Let $\mathcal{D}(m, k, r, s)$
be the distribution of $(RK_1, \ldots, RK_r)$ spanned by the key scheduling
algorithm of DFCv2(m, k, r, s) when K is a uniformly distributed k-bit key, and
let $\mathcal{D}^*$ denote the uniform distribution over rm-bit sequences.
DFCv2* relies on the $\mathcal{D}^*$ distribution, but DFCv2 uses the
$\mathcal{D}$ distribution.

Let $H_t(m, k, r, s)$ be the best advantage of a Turing machine limited to t
steps for distinguishing $\mathcal{D}(m, k, r, s)$ from $\mathcal{D}^*$ from a
single sample (i.e. an rm-bit string). ($H_t$ is a heuristic function. We need
to assume that for a reasonable t, $H_t$ is small.)
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Theorem 3. If for some class Ck^n of distinguishers limited to a complexity 
oft and n oracle calls, the advantage for distinguishing DFCv2* (m,r) from 
a random permutation is limited to BestAdv, then the advantage for distin- 
guishing DFCv2(m, fc, r, s) from a random permutation in class Cl is limited 
to fc, r, s) + BestAdv where the 0{n) corresponds to the cost of sim- 

ulating DFCv2 on n oracle calls. 

Therefore, assuming that the complexity of a practical attack already includes
an overestimated cost for simulating the oracle calls (in practice, using an oracle
costs more than simulating it), all security results on DFCv2* extend to
DFCv2 with an advantage offset of $H_t$.

For practical t, m > 128, k > 128, s > 4 and r < we conjecture that 
Ht{m, k, r, s) is negligible. 



4.2 Best Attacks 

So far, the best reported attack is Knudsen’s impossible differential attack [9] 
against DFCv2 reduced to six rounds. It requires 2^° chosen plaintexts and a 
complexity of 2^^® encryptions (see [10]). This attack can be compared to a IR 
attack that uses a differential characteristic on 5 rounds (for which the complex- 
ity lower bound indicated by Theorem 2 is of order 2^^ chosen plaintexts). 

Harvey recently reported^ an attack against four rounds which uses the non- 
injective properties of the round functions. 

Another quite strong claim of insecurity is due to Rijmen and Knudsen [10]. 
Basically, they study a key-dependent one-round differential characteristic for 
a modified version of DFC and deduce some insecurity claims. One problem is 
that they use a difference which is not defined by the XOR operation but by the 
mod 2^ difference at the input and by the mod p difference at the output. This 
makes it hard to pile up such kinds of characteristics. 

For instance, Rijmen and Knudsen noticed that if we replace all XORs in the 
round function by regular additions, every single input difference leads to about 
800 possible output differences, one of it with probability 2“^ (with m = 128). 
These mod 2^ output differences translate into XOR output differences within 
a probability related to their Hamming weight (because of carry bits). We can 
thus estimate that the real DFC round function will lead to no key-dependent 
differential probabilities greater than 2“^^. Therefore, we believe the Rijmen- 
Knudsen observation does not imply any insecurity statement for DFCv2. 



5 The DFC Controversy 

The submission of DFCvl to AES led to a controversy which was oriented to- 
wards three arguments which are addressed in the following subsections. 

^ At the Rump Session of Fast Software Encryption 2000.
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5.1 Speed 

DFCv1 was claimed to be among the slowest of the 15 AES candidates, and one
of the worst for low-cost smart card implementations.

A fair performance comparison is a really hard task, as was shown by the
AES conferences [18, section 4]. Timings have been collected by Granboulan [8]
and Lipmaa [12], and DFC is without any doubt among the 8 fastest candidates
in software: Crypton, DFC, E2, Mars, RC6, Rijndael, Serpent and Twofish. It is
even the fastest candidate on architectures that have fast multiplication (Alpha
and TurboSparc). When compared to the five finalists, DFC can be considered
as achieving the same performance as Mars on current architectures (but being
twice as fast on future architectures like Itanium). The dependence of DFC on
multiplication can be compared to the dependence of RC6 on data-dependent
rotations.

In addition, it was shown in [17] that DFC was reasonably implementable
on very simple embedded microprocessors (such as the Motorola 6805 for smart
cards). DFC does not take as much room on low-cost smart cards as Mars, and
should have similar performance. On high-end smart cards (StrongARM) DFC
is probably the fastest of all AES candidates.

In conclusion, the performance of DFC is not the best, but it compares very
well to Mars, which is one of the finalists.



5.2 Provable Security 

The provable security results were subject to controversy. We believe this was 
due to misunderstanding and we would like to clarify the situation. 

After the DES was proposed, several other block ciphers showed up without 
any formal security argument. The security was essentially empirical: a block 
cipher was secure until someone came up with an attack. Although this approach 
proved very fruitful for promoting research on the analysis of block ciphers, the 
security provided is now debatable since the analysis time of all world experts 
is rather limited. Besides, we note that there were 15 candidates to analyze in 
less than one year, while DES weaknesses were discovered only after 10 years of 
public exposure. 

A large number of regular block ciphers rely on regular “security claims”,
which essentially consist of heuristic arguments (like the argument on $H_t$
we used above for DFCv2). Typically, people argue that we cannot get good
differential characteristics by regular active S-box counting arguments. This
paradigm was inherited from the work of Biham and Shamir [3] and Coppersmith's
analysis of DES [4].

In 1992, Lai and Massey [11] proposed the formal notion of “Markov cipher” 
which characterizes ciphers for which differentials can nicely be piled up. For 
these ciphers we can formally prove the heuristic security arguments against 
differential cryptanalysis on average over the key space. 

Another more formal approach, on which few block ciphers are based
(including MISTY [13,14]), is inherited from the Nyberg-Knudsen Theorem [16]. It
consists of using ad hoc constructions with heavy non-linear constraints on
S-boxes and deducing that the block cipher has no good differential property on
average over the key distribution. These results are however limited to
differential (and linear) attacks.

Our paradigm obtains similar results to the previous approach in a more 
general setting for basically no cost. It further provides more freedom in the 
construction of the block cipher. Thus, we believe it is a better alternative which 
follows the construction trends. 

One objection by Rijmen and Knudsen [10] argued that since there exist insecure
algorithms for which similar security claims hold, such claims are worthless.
Indeed, the affine cipher $x \mapsto K_1 x + K_2$ has a perfect pairwise
decorrelation, which means that Theorem 2 holds with BestAdv = 0, and in
particular, no differential distinguisher gets a relevant advantage. (The
differential is chosen before the attack itself in this model, so it is
independent of the key.) This comes from the fact that we can “only” say that
the probability of any differential is low on average over the key space.
Previous formal approaches suffer from the same drawbacks. Actually, the Markov
cipher approach is quite similar, and the Nyberg-Knudsen approach has the same
result. As compared to the Nyberg-Knudsen approach, the present one holds for
regular ciphers (not only for ad hoc constructions). Therefore we claim that
DFCv2 benefits from all the regular heuristic security arguments and the present
formal security proof (which is not the case of the affine cipher, nor of any
other regular cipher). This suggests that DFC has its raison d'être.



5.3 Security Margin 

Another criticism against DFC was its low “security margin”. The DFC philosophy
consisted of not overestimating the minimal number of secure rounds
and committing to the formal results obtained by decorrelation theory. We
actually believe that for construction reasons, the security increases faster
with the number of rounds than for other designs. We chose r = 8 as a challenge
to the cryptographic community. Users who would not like to commit to such a bet
can however freely use a higher number of rounds in the present DFCv2 version
(for instance, r = 12 as recommended by Biham).



6 Conclusion 

We have presented an updated version of DFC in which we changed the key 
schedule and introduced scalable parameters. These modifications left the secu- 
rity results unchanged (except the weak key attack which has been fixed). 

Despite the controversy during the AES process, we have shown that
DFCv2 is one of the fastest block ciphers (on 64-bit microprocessors which have
an optimized multiplier for m = 128) and benefits from some formal security
results in addition to regular heuristic arguments.
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Although this first generation of decorrelated ciphers may still be improved 
by the research community, we hope this paradigm will be useful to develop 
future cryptographic algorithms. 
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A Test Vector 

A test vector for the nominal choice of parameters (m = 128, k ∈ {128, 192, 256},
r = 8 and s = 4) is included below.

We have chosen to use KS as key and 0x (the all-zero block) as plaintext. We
recall the value of KS:

86d1bf27 5b9b241d eb64749a 47dfdfb9x
6632c3eb 061b6472 bbf84c26 144e49c2x

The key schedule tests all KAB entries but $KAB_1$ and $KAB_{12}$ (which are not
used with this choice of parameters). It results in the following subkeys:



round  subkeys

1      05c5bd24 aa6ba7df 0846cb21 e1ab0dc7x
2      63b67a97 142061ce c034fd75 ea2cd3d9x
3      abf20d20 9b963b4c f04efdd6 2a6c459dx
4      27215d71 2b28c6cb e2f472eb 288d47e8x
5      02aae49f caf2ddf3 60405b1d d0d269a7x
6      2a516cdc 6270af2b f3db8f26 c26ea9ebx
7      94d3b898 ccbca828 4f6af189 39230738x
8      6c9d3c7e d7059bcc 7a3d4288 f232b634x
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The iterated encryptions of plaintext Ox tests all entries in the RT table for 

j = 64.

j      j-th iterated encryption of 0x under key KS

0      00000000 00000000 00000000 00000000x
1      1ba5af95 aba096ed 5b6c9750 2fe7efa2x
2      0f36105c 1302d52a e47d6d42 dfaaf5c7x
3      bb58f671 54c59d52 fefb03a8 74c138c5x
4      acc4cf76 6505c09f 5ffe10d5 b021d66cx
8      62395cc6 ba7bf158 f78b5897 04a1db59x
16     387c4222 c61f5e69 7946e251 eb40031ax
32     4ab38d66 16247c2a efbe0cde 4d302a86x
64     ee043b7d a8610c46 3e282198 c93887b4x
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Abstract. This paper proposes a nested (hierarchical) SPN structure 
and the symmetric block cipher “Hierocrypt”. In the nested SPN structure,
lower-level SPN structures are recursively embedded into S-box
positions in the SPN of the higher level. This structure recursively assures
the lower bound on the number of active S-boxes, so that a high security level
is efficiently realized. The 8-round Hierocrypt is implemented in the C language
on a Pentium III, and its performance is in the middle range of the final AES
candidates.



1 Introduction 

The substitution-permutation network (SPN, for short) is one of the most impor- 
tant structures besides the Feistel network. The wide trail strategy is effective 
for an SPN cipher to achieve high security against the differential and linear 
cryptanalysis [1,2, 3]. 

The optimal invertible linear mapping of diffusion layer is an essential com- 
ponent for the wide trail strategy. The mapping is usually called the maximum 
distance separable (MDS) mapping [2,4,5]. MDS mapping is optimal for the 
number of active S-boxes, which ensures the upper-bound of the characteristic 
probability for differential and linear cryptanalysis. 

Rijmen et al. designed the 64-bit block cipher SHARK, where eight parallel
8-bit S-boxes are mixed by the permutation layer, and the number of active
S-boxes in two consecutive layers is at least 9 [2]. It seems that the structure
of SHARK is effective for a larger block size. However, a straightforward
extension to a larger block size has the disadvantage that the computational
cost of the MDS part is proportional to the square of the block size^.

As a solution to the problem, Daemen, Rijmen et al. proposed the 128-bit 
ciphers SQUARE and Rijndael, where sixteen 8-bit S-boxes are divided into four 
parts composed of four S-boxes, and a local MDS operation is applied to each 
of them [4,5]. Although the minimum number of active S-boxes in consecutive 
two rounds is only 5, any trail of four consecutive rounds has at least 25 active 
S-boxes. 

^ The MDS operation consists of matrix multiplication where the number of matrix 
elements is proportional to the square of matrix size 
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We propose a new class of SPN structure, a nested (or hierarchical) SPN 
structure, and the cipher Hierocrypt^ based on the structure in this paper. The 
nested SPN structure is a multiple-level recursive structure (See Figure 1) which
recursively assures the lower bound on the number of active S-boxes, so that a
high security level is efficiently realized. The nested SPN structure can be
regarded as a generalization of the SQUARE/Rijndael-type cipher. The
generalization makes it possible to improve the security against the SQUARE
attack.

The construction of this paper is as follows. In the following section, the 
nested SPN structure is introduced. In Sect. 3, we show an overview of the ci- 
pher Hierocrypt, which is composed of the nested SPN structure. In Sect. 4, we 
describe how the components of Hierocrypt are designed. In Sect. 5, the security 
of Hierocrypt against some attacks is discussed. Sect. 6 shows the software
performance of 8-round Hierocrypt on some CPUs. The final section is devoted to
the concluding remarks.



2 Nested SPN Structure 

A nested (hierarchical) SPN structure is a multiple-level structure, where an 
S-box in a certain level consists of a 1-level lower SPN (See Figure 1). 




Fig. 1. Nested SPN structure 



Fig. 2. 4-round nested SPN cipher



We impose the following conditions to realize the wide trail strategy efficiently 
[2,4,5]. 

(a) The final round of SPN consists only of an S-box layer (not followed by a
diffusion layer) in all levels;

(b) All permutations are MDS in each level;

(c) The number of rounds is even in all levels except for the highest;

(d) Bit-wise key additions are located directly before all lowest-level S-box
layers and directly after the final one.

^ Two versions of the cipher (Type-I and Type-II) are proposed in this paper.

Figure 2 shows an example of two- level nested SPN cipher, which consists of 3 
rounds of the higher-level SPN, whose S-boxes have a 2-round SPN structure. 
The number of parallel S-boxes is 4 for both levels. When the size of S-box is 
8-bit, the block length of the cipher is 128-bit. 

The following proposition is important to realize the wide trail strategy in 
the nested SPN. 

Proposition 1. Consider a 2-level nested SPN with the following conditions: (i)
the above conditions (a)-(c) are satisfied; (ii) the lower-level SPN is 2-round;
(iii) the numbers of parallel S-boxes are $m_1$ and $m_2$ for the higher- and
lower-level SPN, respectively. Then, any 2 consecutive higher-level rounds contain
no less than $(m_1 + 1)(m_2 + 1)$ lower-level active S-boxes, for a nonzero
differential/mask.

Proof. For a non-zero differential or non-zero mask pattern, there are no less
than $(m_1 + 1)$ active higher-level S-boxes, each of which has no less than
$(m_2 + 1)$ active lower-level S-boxes. Thus, at least $(m_1 + 1)(m_2 + 1)$
lower-level S-boxes are active. □

To construct a nested SPN cipher which satisfies the above condition (b), an
MDS code with a large word size is needed. Here, let the (n, k, d) code be a code
where n is the block length, k is the number of information digits, and d is the
minimum distance. For the case of Figure 2, we need an (8, 4, 5) error-correcting
code over the 32-bit word set for the higher-level diffusion. Although the
Reed-Solomon code over GF($2^{32}$) satisfies the condition, calculation over
GF($2^{32}$) is often costly. An alternative way is construction by
concatenating parallel smaller MDS codes, which is based on the following
proposition.

Proposition 2. Let $MDS_n$ be an MDS mapping defined by a (2m, m, m + 1)-code
over the n-bit word set. Then an MDS mapping $MDS_{m'n}$ based on a
(2m, m, m + 1)-code over the m'n-bit word set is constructed by concatenating m'
sets of $MDS_n$.

Proof. The proposition is proven by constructing an example of a
(2m, m, m + 1)-code over m'n-bit words.

Consider m' sets of the mapping $MDS_n$,

$$MDS_n : x_{1j} \| x_{2j} \| \cdots \| x_{mj} \mapsto y_{1j} \| y_{2j} \| \cdots \| y_{mj}, \quad 1 \le j \le m',$$

and define the concatenation as follows.

$$X_i = x_{i1} \| x_{i2} \| \cdots \| x_{im'}, \quad Y_i = y_{i1} \| y_{i2} \| \cdots \| y_{im'}, \quad 1 \le i \le m.$$

Then define the following mapping $MDS_{m'n}$:

$$MDS_{m'n} : X_1 \| X_2 \| \cdots \| X_m \mapsto Y_1 \| Y_2 \| \cdots \| Y_m.$$

For a nonzero differential/mask, at least one of the m' sub-mappings $MDS_n$ is
active, therefore at least (m + 1) m'n-bit words from $\{X_i\}$ and $\{Y_i\}$ are
active. This means that the mapping is MDS. □
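
The construction can be checked by brute force on a toy instance. The sketch
below is an illustration under stated assumptions, not a component of
Hierocrypt: it takes m = 2, n = 2, m' = 2, uses GF(4) with the polynomial
$z^2 + z + 1$, and picks the matrix [[1, 1], [1, 2]] as a small MDS mapping.

```python
from itertools import product

def gf4_mul(a, b):
    """Multiply two elements of GF(4) = GF(2)[z] / (z^2 + z + 1)."""
    result = 0
    for _ in range(2):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b100:          # degree 2 reached: reduce by z^2 + z + 1
            a ^= 0b111
    return result

def mds_small(x1, x2):
    """Toy (4, 2, 3) MDS mapping over GF(4) with matrix [[1, 1], [1, 2]]."""
    return x1 ^ x2, x1 ^ gf4_mul(2, x2)

def mds_concat(X1, X2):
    """Apply mds_small to each 2-bit column of the 4-bit words X1, X2."""
    cols = [mds_small((X1 >> s) & 3, (X2 >> s) & 3) for s in (2, 0)]
    Y1 = (cols[0][0] << 2) | cols[1][0]
    Y2 = (cols[0][1] << 2) | cols[1][1]
    return Y1, Y2

def min_active_words():
    """Minimum number of nonzero words among X1, X2, Y1, Y2 over nonzero inputs."""
    best = None
    for X1, X2 in product(range(16), repeat=2):
        if X1 == 0 and X2 == 0:
            continue
        Y1, Y2 = mds_concat(X1, X2)
        active = sum(w != 0 for w in (X1, X2, Y1, Y2))
        best = active if best is None else min(best, active)
    return best
```

The minimum found is m + 1 = 3 at the level of 4-bit words, as the proof
predicts for this toy case.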
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Fig. 3. SQUARE in nested SPN form 



Fig. 4. Rijndael in nested SPN form 



The above construction of MDSm'n is a fundamental one. The mapping is 
generalized by putting invertible linear transformations on input and output 
words of m'n-bit length. 

The above construction of MDS mapping gives a new viewpoint for SQUARE 
and Rijndael. Figures 3 and 4 respectively show mathematically equivalent forms 
of four-round SQUARE and Rijndael. All MDS are the same. The central large 
rectangle corresponds to the MDS of higher level, consisting of 4 parallel sub- 
MDS of (8, 4, 5)-code. Thus, the parameters are m = m' = 4 ; n = 8, which 
guarantees that no less than 25 S-boxes are active in four consecutive rounds. 
That is, the same code is used in both-level MDS mapping for SQUARE. 

3 Overview of Hierocrypt Encryption 

Hierocrypt is a two-level nested SPN cipher which has the following features.

(α) The size of the lower-level S-box is 8-bit;

(β) The number of parallel S-boxes is 4 in both levels;

(γ) The lower-level structure is a 2-round SPN;

(δ) Diffusion layers of both levels consist of MDS mappings defined by an
(8, 4, 5)-code.

The lower-level diffusion $MDS_L$ is based on an (8, 4, 5)-code over GF($2^8$).
Two types of higher-level diffusions $MDS_H$ are designed on (8, 4, 5)-codes over
GF($2^{32}$) and GF($2^4$), respectively.
For respective Galois fields, the following primitive polynomials are used in 
this paper. 



P4{z) = z'^ + z + 1 , 

ps{z) = z^ + z^ + z^ + z + l , 

P 32 {z) = z^^ +Z^^ +Z^^ + Z + 1 , 



for GF(24) , 
for GF(28) , 
for GF(232) ^ 
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3.1 Lower-Level Components of Hierocrypt 

The lower-level SPN structure of Hierocrypt consists of three components:
bit-wise key addition AK, byte substitution [S], and diffusion [$MDS_L$].

Bit-Wise Key Addition AK (1-Bit × 128). A 128-bit half-round key K is
added to the data (See Sect. 3.4 for the key scheduling).

$$AK(K) : X \mapsto Y, \quad Y = X \oplus K.$$

Byte Substitution [S] (8-Bit × 16). The byte substitution [S] consists of
parallel operations of the S-box S (See Sect. 4.1).

$$[S] : X \mapsto Y, \quad y_{ij} = S(x_{ij}), \quad i = 1, \ldots, 4;\ j = 1, \ldots, 4.$$

The square bracket [ ] means parallel operation.

Lower-Level Diffusion [$MDS_L$] (32-Bit × 4). The lower-level diffusion
$MDS_L$ mixes 4 parallel bytes (See Sect. 4.2).

$$[MDS_L] : X \mapsto Y, \quad Y_i = MDS_L(X_i), \quad i = 1, \ldots, 4.$$

3.2 Higher-Level Components of Hierocrypt 

The higher-level SPN consists of two components: parallel 4-byte substitution
[XS] and higher-level diffusion $MDS_H$, except for the final key addition.

4-Byte Substitution [XS] (32-Bit × 4).

$$[XS] : X \mapsto Y, \quad Y_i = XS(X_i), \quad i = 1, \ldots, 4.$$

The 4-byte substitution [XS] consists of the three kinds of lower-level
components, combined as follows.

$$[XS] = [S] \circ AK(\cdot) \circ [MDS_L] \circ [S] \circ AK(\cdot),$$

where the key addition applied first (the rightmost one) uses the first half of
the round key and the other one uses the second half (See Sect. 3.4).

Higher-Level Diffusion $MDS_H$ (128-Bit × 1).

$$MDS_H : X \mapsto Y.$$

Two types are given in Sect. 4.3 and 4.3. 

3.3 Round Functions and Encryption 
Round Function ρ (Except for Final).

$$\rho = MDS_H \circ [XS].$$

Final Round Function ρ'.

$$\rho' = [XS].$$

Hierocrypt Encryption. The Hierocrypt encryption of T rounds consists of
(T − 1) iterations of the round function ρ followed by the final round function
ρ' and the final key addition.

Enc= o p'{K^^'>) o p{K^^~^^) o ■■■ o piK^"^^) o p{K^^^) . 

3.4 Key Scheduling 

The key scheduling part consists of an initial key expansion KX and iterative 
key generations KH. 

= KX {K) , 

iCd) = KH , (1 < t < 2T + 1) . 

The data randomization part requires two-round iterations oi KH per round. 



Initial Key Expansion. The initial key expansion KX expands an encryption
key K (128/192/256 bits) up to 256 bits by padding. The 32-bit key data $K_i$ is
represented as a concatenation of four 8-bit data.

$$K_i = k_{i1} \| k_{i2} \| k_{i3} \| k_{i4}.$$

[128-bit key]

$$K = K_1 \| K_2 \| K_3 \| K_4 = k_{11} \| k_{12} \| k_{13} \| k_{14} \| k_{21} \| k_{22} \| \cdots \| k_{43} \| k_{44},$$
$$KX(K) = K_1 \| K_2 \| K_3 \| K_4 \| K_1 \| K_2 \| K_3 \| K_4.$$

[192-bit key]

$$K = K_1 \| K_2 \| K_3 \| K_4 \| K_5 \| K_6 = k_{11} \| k_{12} \| \cdots \| k_{43} \| k_{44} \| k_{51} \| k_{52} \| \cdots \| k_{63} \| k_{64},$$
$$KX(K) = K_1 \| K_2 \| K_3 \| K_4 \| K_5 \| K_6 \| K_1 \| K_2.$$

[256-bit key]

$$KX(K) = K_1 \| K_2 \| K_3 \| K_4 \| K_5 \| K_6 \| K_7 \| K_8.$$
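
The three padding cases follow a single pattern: the key words are repeated
cyclically until eight 32-bit words are available. A minimal sketch, in which
the function name and the list-of-words representation are assumptions:

```python
def expand_key(words):
    """Pad a key given as 4, 6 or 8 32-bit words K_1.. to eight words."""
    if len(words) not in (4, 6, 8):
        raise ValueError("expected a 128-, 192- or 256-bit key")
    # Repeating the key and keeping the first eight words covers all cases.
    return (list(words) + list(words))[:8]
```

For a 192-bit key this yields $K_1, \ldots, K_6, K_1, K_2$, and a 256-bit key is
returned unchanged.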



Key Round Function. The key round function transforms the {t — l)-th 
intermediate key into the t-th one kW, 

= KH . 
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Fig. 5. Overview of the key scheduling 



The key divides into two halves: (left half) and (right half), 

The output of key round function corresponds to the round key for data ran- 
domization as follows. 

a:(*)“ = = ATA^^*) . 

The outline of the key round function is as follows (See Fig. 5). 

• [right I — > left] 

[MDSi^] o [S'] : ^ . 

where $[S]$ is given in Sect. 3.1.2, and $[MDS_L]$ is given in Sect. 3.1.3.

• [left I — > right] 

AA>+ o AA) oMASh : ^ KR^*K 

where $MDS_H$ is given in Sect. 3.2.2, and $AD(G^{(t)})$ is the addition of the round constant
$G^{(t)}$:

$X'_i = X_i + G_i^{(t)} \pmod{2^{32}}$ , $(i = 1, \ldots, 4)$ .

The constants $G^{(t)}$ are given in Table 1.

Table 1. Round constant

G^(2)  = (H3, H2, H0, H3)
G^(3)  = (H1, H0, H0, H0)
G^(4)  = (H1, H0, H1, H3)
G^(5)  = (H0, H1, H0, H2)
G^(6)  = (H3, H2, H0, H0)
G^(7)  = (H1, H2, H1, H0)
G^(8)  = (H2, H1, H2, H3)
G^(9)  = (H2, H1, H0, H0)
G^(10) = (H1, H1, H1, H2)
G^(11) = (H3, H1, H1, H2)
G^(12) = (H1, H1, H2, H0)
G^(13) = (H1, H1, H3, H1)
G^(14) = (H2, H1, H1, H1)
G^(15) = (H1, H1, H1, H0)
G^(16) = (H1, H0, H0, H1)
G^(17) = (H1, H2, H0, H1)



Generation of Round Constant. To prevent weak key generation, such as a
cyclic pattern in the round key sequence, we introduce 32-bit constant
parameters $G_i^{(t)}$ (t = 1, ..., 17; i = 1, ..., 4), which are given in Table 1. Here,
the constants $H_j$ (j = 0, ..., 3) are given as follows (the prefix "0x" indicates
hexadecimal numbers).

$H_0$ = 0x5A827999 = trunc($\sqrt{2}/4$) , $H_1$ = 0x6ED9EBA1 = trunc($\sqrt{3}/4$) ,

$H_2$ = 0x8F1BBCDC = trunc($\sqrt{5}/4$) , $H_3$ = 0xCA62C1D6 = trunc($\sqrt{10}/4$) ,

where 'trunc' is the truncation function, which is defined by using the floor
function³.

trunc$(x) = \lfloor 2^{32} x \rfloor$ .
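The four constants can be reproduced with exact integer arithmetic. The sketch below relies on the identity $\lfloor 2^{32}\sqrt{n}/4 \rfloor = \lfloor \sqrt{n \cdot 2^{60}} \rfloor$; the helper name `trunc_sqrt_over_4` is an illustrative choice, not a name from the specification.

```python
from math import isqrt

def trunc_sqrt_over_4(n: int) -> int:
    """Return floor(2**32 * sqrt(n) / 4) without floating-point rounding."""
    # 2**32 * sqrt(n) / 4 = 2**30 * sqrt(n) = sqrt(n * 2**60)
    return isqrt(n << 60)

# H_0 .. H_3 are derived from the square roots of 2, 3, 5 and 10.
H = [trunc_sqrt_over_4(n) for n in (2, 3, 5, 10)]
print([f"0x{h:08X}" for h in H])
```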

Table 1 is given by the following simple rule. We use the eight-bit linear feedback 
shift register (LFSR) of Fibonacci type, of which the primitive polynomial is 
+ z'^ + z'^ + + 1, and the initial state is z'^ + z'^ + z + 1. The LFSR generates 

a bit sequence $c_1, c_2, c_3, \ldots$ Successive pairs of these bits determine the suffix of
$H_j$. Specifically, the i-th constant of the t-th round is given as follows.

$G_i^{(t)} = H_j$ , $j = 2c_{8(t-1)+2i-1} + c_{8(t-1)+2i}$ .

4 Design of the Components 

We describe how the components of Hierocrypt are designed in this section. 



³ The floor function $\lfloor x \rfloor$ is the largest integer no more than x.

In the component design, the maximum differential and linear probabilities
are the most important security measures for block ciphers. The maximum
differential probability for the function f is defined as

$dp^f = \max_{\Delta x \neq 0,\, \Delta y} \frac{\#\{x \mid f(x) \oplus f(x \oplus \Delta x) = \Delta y\}}{2^n}$ .   (1)

Similarly, the maximum linear probability for the function f is given as

$lp^f = \max_{\Gamma x,\, \Gamma y \neq 0} \left| \frac{\#\{x \mid x \cdot \Gamma x = f(x) \cdot \Gamma y\}}{2^n} - \frac{1}{2} \right|$ .   (2)
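Both measures can be computed by exhaustive search for a small S-box. The following sketch assumes definition (1) as a fraction of the $2^n$ inputs and reads definition (2) as the absolute deviation from 1/2; the function names and the 4-bit example table (the S-box of the PRESENT cipher, used only because it is small) are not taken from this paper.

```python
def max_dp(sbox):
    """Maximum differential probability over nonzero input differences."""
    size = len(sbox)
    best = 0
    for dx in range(1, size):
        counts = [0] * size
        for x in range(size):
            counts[sbox[x] ^ sbox[x ^ dx]] += 1
        best = max(best, max(counts))
    return best / size

def parity(v):
    """Inner product over GF(2) reduces to the parity of the masked value."""
    return bin(v).count("1") & 1

def max_lp(sbox):
    """Maximum of |#{x : x.Gx = f(x).Gy} / 2^n - 1/2| over nonzero output masks."""
    size = len(sbox)
    best = 0.0
    for gy in range(1, size):
        for gx in range(size):
            hits = sum(parity(x & gx) == parity(sbox[x] & gy) for x in range(size))
            best = max(best, abs(hits / size - 0.5))
    return best

# Small 4-bit example so that the search finishes instantly.
EXAMPLE_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
print(max_dp(EXAMPLE_SBOX), max_lp(EXAMPLE_SBOX))
```

The same functions work unchanged on a 256-entry table, although the linear search then takes noticeably longer in pure Python.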



4.1 Lower-Level S-Box S 

The lower-level S-box S is given as follows (in hexadecimal notation).

(S(0) S(1) ··· S(F) S(10) ··· S(FF))

= ( 72 AA 49 16 1E 3A 43 AE 66 BC 00 73 79 3B FB 9F
    69 6A A2 50 6E F5 EF AC 22 02 AD 26 E2 DF 97 F0
    9E BF 17 8B FA 7C F4 71 7F CA F6 52 FD C3 E5 64
    53 8D E0 F3 0F 78 CB 9B 68 3C 0D 1F 89 B6 EB F7
    44 4A 06 A6 56 6B 85 01 30 88 51 31 9C A0 A3 25
    60 5B FF 05 B7 91 15 B3 A9 20 03 2B 61 42 95 4D
    F9 7E 0E E9 D8 F1 46 99 CE BE D9 54 80 B0 D2 4F
    7A E8 35 92 1B 7B 12 D6 4C D5 E7 EE B1 24 DE 21
    04 10 AB 29 9A 81 FE A7 B8 63 28 0A 8A D1 C6 07
    B9 C8 98 82 74 9D 84 47 94 C7 6C 11 D7 BA C1 C9
    DD 77 39 2F 2E C2 67 41 E4 58 34 CD 1C 93 96 7D
    2C F8 B5 70 14 08 DC CC 87 D0 5E 32 C5 C4 59 3E
    CF 55 5C 23 75 2D 2A 86 4B 1D 5F E6 FC B2 4E 09
    27 AF 19 B4 BD 6D 3D 6F ED 62 EA F2 D3 36 38 DB
    BB 83 45 37 A4 EC 8C 5D E1 33 90 A1 40 8E 1A A5
    0B 3F 5A DA 13 76 0C C0 48 E3 65 A8 18 8F D4 57 )



The maximum differential probability of the lower-level S-box S is

$dp^S = \frac{6}{256}$ ,

and the distribution of differential probabilities for nonzero input differentials
is shown in Table 2.

Similarly, the maximum linear probability is

$lp^S = \frac{22}{256}$ ,

and the distribution of the linear probabilities for nonzero output mask patterns
is shown in Table 3.

The algebraic order of the S-box is 7, which is the highest value for an 8-bit
bijection.

Table 2. Differential probability distribution

differential probability (/256)    num.
6                                  50
4                                  1,899
2                                  28,692
0                                  34,639
total                              65,280

Table 3. Linear probability distribution

linear probability (/256)          num.
22                                 20
20                                 157
18                                 577
36                                 579
0                                  5,748
total                              65,280
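The algebraic order quoted for S in Sect. 4.1 is the largest degree of any monomial in the algebraic normal form of the output bits. A generic way to compute it is the binary Möbius transform, sketched below; the function name and the small test functions are illustrative assumptions, and the code is not specific to the Hierocrypt S-box.

```python
def algebraic_degree(sbox, n_bits):
    """Largest algebraic degree over the n_bits output coordinates of sbox."""
    degree = 0
    for bit in range(n_bits):
        # Truth table of one output bit, turned into ANF coefficients in place.
        anf = [(y >> bit) & 1 for y in sbox]
        step = 1
        while step < len(anf):
            for start in range(0, len(anf), 2 * step):
                for j in range(start, start + step):
                    anf[j + step] ^= anf[j]
            step *= 2
        # The index of a nonzero coefficient encodes which input bits the monomial uses.
        for monomial, coeff in enumerate(anf):
            if coeff:
                degree = max(degree, bin(monomial).count("1"))
    return degree

# A 3-bit AND has degree 3; the identity map has degree 1.
print(algebraic_degree([0, 0, 0, 0, 0, 0, 0, 1], 1), algebraic_degree(list(range(8)), 3))
```

Applied to a 256-entry table with `n_bits=8`, the result can be at most 7 for a bijection, which is the bound referred to in the text.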



4.2 Lower-Level Diffusion $MDS_L$ and Higher-Level S-Box XS

The security of the higher-level S-box XS depends on the combination of the S-box
S and the permutation $MDS_L$. We have chosen $MDS_L$ from randomly generated
MDS matrices over GF($2^8$), such that the differential and linear properties are
better than the usual case where all active S-boxes take the worst probability.

As the branch number of $MDS_L$ is 5, the maximum differential and linear
characteristic probabilities, $DP^{XS}$ and $LP^{XS}$, respectively satisfy the following
inequalities.

$DP^{XS} \le \left(\frac{6}{256}\right)^{5} \approx 2^{-27.1}$ ,

$LP^{XS} \le 2^{4} \left(\frac{22}{256}\right)^{5} \approx 2^{-13.7}$ .

The above inequalities are satisfied only if the lower-level permutation
$MDS_L$ is an MDS mapping. We succeeded in improving the differential and
linear properties by selecting the following $MDS_L$.



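$MDS_L(X_i) = D_L X_i$ ,

$D_L = \begin{pmatrix} \mathrm{6C} & \mathrm{25} & \mathrm{9B} & \mathrm{03} \\ \mathrm{6D} & \mathrm{06} & \mathrm{C8} & \mathrm{18} \\ \mathrm{75} & \mathrm{78} & \mathrm{9E} & \mathrm{1F} \\ \mathrm{42} & \mathrm{78} & \mathrm{EB} & \mathrm{61} \end{pmatrix}$ .

The matrix elements are expressed in hexadecimal and regarded as elements
of GF($2^8$). For example, the element 25 is regarded as the polynomial $z^5 + z^2 + 1$
of GF($2^8$).

$\mathrm{0x25} = 1 \cdot 2^5 + 0 \cdot 2^4 + 0 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0$

$\mapsto 1 \cdot z^5 + 0 \cdot z^4 + 0 \cdot z^3 + 1 \cdot z^2 + 0 \cdot z + 1 = z^5 + z^2 + 1$ .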
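The correspondence between a hexadecimal byte and its polynomial can be made mechanical. The helper below is a small sketch with an assumed name, `byte_to_poly`; it only formats the bit pattern and does not perform any field arithmetic.

```python
def byte_to_poly(value: int) -> str:
    """Write an 8-bit value as a polynomial in z over GF(2)."""
    terms = []
    for power in range(7, -1, -1):
        if (value >> power) & 1:
            if power == 0:
                terms.append("1")
            elif power == 1:
                terms.append("z")
            else:
                terms.append(f"z^{power}")
    return " + ".join(terms) if terms else "0"

print(byte_to_poly(0x25))  # z^5 + z^2 + 1
```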



We use the following criteria for selection.

- At least one S-box has a differential probability no more than 4/256, for any
nonzero input differential for which the number of active S-boxes is 5 (see Table 2).

- At least one S-box has a linear probability no more than 20/256, for any
nonzero output mask pattern for which the number of active S-boxes is 5 (see Table 3).

If $MDS_L$ satisfies the criteria, the inequalities for XS are refined as follows.

$DP^{XS} \le \left(\frac{6}{256}\right)^{4} \cdot \frac{4}{256} \approx 2^{-27.7}$ ,

$LP^{XS} \le 2^{4} \left(\frac{22}{256}\right)^{4} \cdot \frac{20}{256} \approx 2^{-13.8}$ .
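The exponents in this subsection follow from the S-box maxima 6/256 and 22/256 and the selection criteria 4/256 and 20/256. The sketch below recomputes them; it assumes that the linear bound combines the five deviations with the piling-up factor $2^{k-1}$, which is the reading under which the figures 13.7 and 13.8 come out. All variable names are illustrative.

```python
from math import log2

ACTIVE = 5                       # branch number of MDS_L: at least 5 active S-boxes
DP_S, LP_S = 6 / 256, 22 / 256   # worst-case S-box values
DP_ALT, LP_ALT = 4 / 256, 20 / 256   # guaranteed for at least one active S-box

# General MDS mapping: every active S-box takes the worst value.
log2_dp_general = ACTIVE * log2(DP_S)
log2_lp_general = (ACTIVE - 1) + ACTIVE * log2(LP_S)   # assumed piling-up factor 2**(k-1)

# Selected MDS_L: one active S-box is bounded by the smaller value.
log2_dp_refined = (ACTIVE - 1) * log2(DP_S) + log2(DP_ALT)
log2_lp_refined = (ACTIVE - 1) + (ACTIVE - 1) * log2(LP_S) + log2(LP_ALT)

for name, value in [("DP general", log2_dp_general), ("LP general", log2_lp_general),
                    ("DP refined", log2_dp_refined), ("LP refined", log2_lp_refined)]:
    print(f"{name}: 2^{value:.1f}")
```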




4.3 Higher-Level Diffusion $MDS_H$

The higher-level diffusion $MDS_H$ is based on an (8,4,5)-code over 32-bit words.
We give two examples, [Type-I]: a mapping based on an (8,4,5)-code over GF($2^{32}$);
[Type-II]: a concatenation of 8 parallel mappings based on an (8,4,5)-code over
GF($2^4$).

Type-I $MDS_H$. Type-I $MDS_H$ is selected from randomly generated matrices
over GF($2^{32}$). For calculational efficiency, we impose the condition that only the
lowest 5 bits can be nonzero for all matrix elements. Multiplication by a constant
over GF($2^{32}$) reduces to 4 table lookups, where the respective inputs are
8 bits long.

$MDS_H(X) = D_H X$ ,

$D_H = \begin{pmatrix} \mathrm{05} & \mathrm{19} & \mathrm{06} & \mathrm{1B} \\ \mathrm{1B} & \mathrm{05} & \mathrm{19} & \mathrm{06} \\ \mathrm{06} & \mathrm{1B} & \mathrm{05} & \mathrm{19} \\ \mathrm{19} & \mathrm{06} & \mathrm{1B} & \mathrm{05} \end{pmatrix}$ .

Type-II $MDS_H$. Type-II $MDS_H$ is based on the following 4×4 matrix $D_H$,
which is selected from randomly generated matrices over GF($2^4$).

$MDS_H(X) = \tilde{D}_H X$ , $\tilde{D}_H = D_H \otimes I_8$ ,

$D_H = \begin{pmatrix} \mathrm{6} & \mathrm{B} & \mathrm{D} & \mathrm{C} \\ \mathrm{C} & \mathrm{6} & \mathrm{B} & \mathrm{D} \\ \mathrm{D} & \mathrm{C} & \mathrm{6} & \mathrm{B} \\ \mathrm{B} & \mathrm{D} & \mathrm{C} & \mathrm{6} \end{pmatrix}$ .

Here, $I_8$ means the 8-dimensional identity matrix. $D_H \otimes I_8$ is eight parallel
multiplications of the matrix $D_H$ by the 16-bit vector (regarded as four elements of
GF($2^4$)) which is made by picking up one bit from each byte.

The byte-oriented form of the mapping is as follows.



$\left(\begin{array}{c} y_{11}\\ y_{12}\\ y_{13}\\ y_{14}\\ y_{21}\\ y_{22}\\ y_{23}\\ y_{24}\\ y_{31}\\ y_{32}\\ y_{33}\\ y_{34}\\ y_{41}\\ y_{42}\\ y_{43}\\ y_{44} \end{array}\right) = \left(\begin{array}{cccccccccccccccc}
0&1&1&0&0&1&0&1&0&0&1&1&1&0&1&1\\
1&0&1&1&1&0&1&0&0&0&0&1&0&1&0&1\\
0&1&0&1&1&1&0&1&1&0&0&0&1&0&1&0\\
1&1&0&0&1&0&1&1&0&1&1&1&0&1&1&0\\
1&0&1&1&0&1&1&0&0&1&0&1&0&0&1&1\\
0&1&0&1&1&0&1&1&1&0&1&0&0&0&0&1\\
1&0&1&0&0&1&0&1&1&1&0&1&1&0&0&0\\
0&1&1&0&1&1&0&0&1&0&1&1&0&1&1&1\\
0&0&1&1&1&0&1&1&0&1&1&0&0&1&0&1\\
0&0&0&1&0&1&0&1&1&0&1&1&1&0&1&0\\
1&0&0&0&1&0&1&0&0&1&0&1&1&1&0&1\\
0&1&1&1&0&1&1&0&1&1&0&0&1&0&1&1\\
0&1&0&1&0&0&1&1&1&0&1&1&0&1&1&0\\
1&0&1&0&0&0&0&1&0&1&0&1&1&0&1&1\\
1&1&0&1&1&0&0&0&1&0&1&0&0&1&0&1\\
1&0&1&1&0&1&1&1&0&1&1&0&1&1&0&0
\end{array}\right) \left(\begin{array}{c} x_{11}\\ x_{12}\\ x_{13}\\ x_{14}\\ x_{21}\\ x_{22}\\ x_{23}\\ x_{24}\\ x_{31}\\ x_{32}\\ x_{33}\\ x_{34}\\ x_{41}\\ x_{42}\\ x_{43}\\ x_{44} \end{array}\right)$
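The byte-oriented matrix can be rebuilt from the 4×4 matrix over GF($2^4$) by replacing each entry with the binary matrix of "multiply by that element". The sketch below does this and applies the result to 16 bytes with XORs only. Two points are assumptions: the field polynomial $z^4 + z + 1$, under which the binary blocks of the printed matrix are reproduced, and the reading of the first row of $D_H$ as (6, B, D, C). All function names are illustrative.

```python
def gf16_mul(a, b, poly=0b10011):
    """Multiply two elements of GF(2^4); poly = z^4 + z + 1 is an assumption."""
    result = 0
    for _ in range(4):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= poly
    return result

def mul_matrix(a):
    """4x4 binary matrix of x -> a*x; column j holds a*z^(3-j), row 0 is the top bit."""
    cols = [gf16_mul(a, 1 << (3 - j)) for j in range(4)]
    return [[(cols[j] >> (3 - i)) & 1 for j in range(4)] for i in range(4)]

D_H = [[0x6, 0xB, 0xD, 0xC],
       [0xC, 0x6, 0xB, 0xD],
       [0xD, 0xC, 0x6, 0xB],
       [0xB, 0xD, 0xC, 0x6]]

def expand(d):
    """Expand a 4x4 matrix over GF(2^4) into a 16x16 binary matrix."""
    big = []
    for block_row in d:
        blocks = [mul_matrix(a) for a in block_row]
        for i in range(4):
            big.append([bit for blk in blocks for bit in blk[i]])
    return big

def mds_h_type2(x: bytes) -> bytes:
    """Apply the 16x16 binary matrix to 16 bytes; each output byte is an XOR of input bytes."""
    out = []
    for row in expand(D_H):
        acc = 0
        for bit, byte in zip(row, x):
            if bit:
                acc ^= byte
        out.append(acc)
    return bytes(out)

print(mds_h_type2(bytes(range(16))).hex())
```

Because every output byte is a plain XOR of input bytes, the same binary matrix acts on all eight bit positions in parallel, which is the meaning of the factor $I_8$.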



5 Security

5.1 Differential and Linear Cryptanalysis

The branch number is 5 for both mappings $MDS_H$ and $MDS_L$. Thus, for a
nonzero differential or mask, Hierocrypt contains at least 5 active XS-boxes per
two rounds, each of which contains at least 5 active S-boxes. Therefore,
Hierocrypt contains at least 25 active S-boxes per two rounds (see Proposition 1).

Then, the maximum differential and linear characteristic probabilities
respectively satisfy the following inequalities.



$DP_{2\,\mathrm{rounds}} \le \left(\frac{6}{256}\right)^{25} \approx 2^{-135.3} < 2^{-128}$ ,

$LP_{2\,\mathrm{rounds}} \le 2^{24} \left(\frac{22}{256}\right)^{25} \approx 2^{-64.5}$ .

This estimation is for a general $MDS_L$. The characteristics for the $MDS_L$ in
Sect. 4.2 are a little better.

$DP^{XS} \le 2^{-27.7}$ , $LP^{XS} \le 2^{-13.8}$ ,

$DP_{2\,\mathrm{rounds}} \le \left(2^{-27.7}\right)^{5} \approx 2^{-138.3} < 2^{-128}$ ,

$LP_{2\,\mathrm{rounds}} \le 2^{4} \left(2^{-13.8}\right)^{5} = 2^{-65}$ .

The above result shows that all two-round differential characteristics are
far below the critical noise level $2^{-128}$, and that the maximum 2-round linear
characteristic is of the same order as the critical noise level $2^{-64}$. We therefore
recommend using Hierocrypt with no fewer than 6 rounds, where the 4 intermediate
rounds provide sufficiently small characteristic probabilities and the 2 remaining
rounds at both ends protect against partial exhaustive key search.

5.2 Other Cryptanalysis

Besides the differential and linear cryptanalysis discussed in the previous
subsection, there are many other attacks. Here we discuss the security against the
SQUARE (dedicated) attack and higher-order differential cryptanalysis
[4,8].

The SQUARE attack was presented by J. Daemen, L.R. Knudsen, and V. Rijmen
in their paper proposing the block cipher SQUARE [4]. The attack is applicable
to SQUARE and other ciphers with similar SPN structures, such as Rijndael or
CRYPTON [5,9]. The attack is applicable to Hierocrypt as well, because of its
SPN structure. Here, we regard one round of Hierocrypt as two "half-rounds"
to compare the strength with SQUARE⁴.

The basic version of the SQUARE attack is applicable to 4-round SQUARE and
4-round Rijndael. The attack is based on the property that all 16
bytes of the 3rd-round output are always balanced over a Λ-set input with
only one active byte. Thus, the 128 key bits can be identified by the basic attack
with 2® plaintexts and 2® encryption time for SQUARE and Rijndael. 
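The notion of a balanced byte used by the attack can be illustrated in isolation: when one byte runs through all 256 values, the XOR of the values it takes after a key addition and any bijective S-box is zero. The sketch below checks this for a randomly shuffled table; the table, the key byte, and the names are hypothetical and unrelated to the actual Hierocrypt or SQUARE components.

```python
import random
from functools import reduce
from operator import xor

def xor_sum(values):
    """XOR of all values; a byte position is 'balanced' when this is zero."""
    return reduce(xor, values, 0)

rng = random.Random(0)
sbox = list(range(256))
rng.shuffle(sbox)               # an arbitrary bijective 8-bit S-box

key_byte = 0x3C                 # arbitrary round-key byte
active_values = range(256)      # the active byte of a Lambda-set takes each value once
outputs = [sbox[x ^ key_byte] for x in active_values]

print(xor_sum(outputs))         # 0
```

The attack exploits the fact that this property survives a limited number of rounds, so a wrong guess for a last-round key byte is very unlikely to produce a zero sum.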

On the other hand, only two bytes of the 3rd half-round output are always 
balanced on the same condition. This property reduces to the fact that only 
64 key bits can be identified by the basic attack with 2^^ plaintexts and 2^^ 
encryption time for Hierocrypt (Type-II). Thus, the efficiency of the basic attack for
Hierocrypt (Type-II) is 1/8 of that for SQUARE/Rijndael.

Further study shows that Hierocrypt is stronger against any extended version of
the dedicated attack.

Next, we discuss higher-order differential cryptanalysis. It is known that
this cryptanalysis is applicable to the KN cipher, which is provably secure against
differential and linear cryptanalysis [8]. The security against higher-order
differential cryptanalysis is estimated by the algebraic order. The algebraic order
of Hierocrypt's S-box is 7, and the order after 3 S-box layers (1.5 rounds in the
Hierocrypt convention) is roughly estimated as

$7^3 = 343 > 128$ .

Therefore, the cryptanalysis does not seem to be feasible.



6 Performance

The cipher Hierocrypt is implemented on four microprocessors. The measured
encryption rates are shown in Tables 4 and 5.

As the implementation here is a rather basic one, the performance is expected to
improve with optimization.

⁴ One half-round of Hierocrypt corresponds to one round of a usual SPN cipher, as it
contains one S-box layer.

Table 4. Speed of Type-I Hierocrypt encryption (8 rounds)

Processor        Freq     Platform             Compiler     Throughput
Pentium III      550MHz   WindowsNT 4.0        VC++ 6.0     43.18 Mbps
Celeron          466MHz   Linux kernel-2.2.5   egcs 1.1.2   31.74 Mbps
Ultra SPARC II   296MHz   Solaris 2.5.1        gcc 2.95.1   15.90 Mbps
Alpha 21164A     599MHz   Digital UNIX V4.0D   gcc 2.95.1   27.04 Mbps

Table 5. Speed of Type-II Hierocrypt encryption (8 rounds)

Processor        Freq     Platform             Compiler     Throughput
Pentium III      550MHz   WindowsNT 4.0        VC++ 6.0     40.33 Mbps
Celeron          466MHz   Linux kernel-2.2.5   egcs 1.1.2   29.00 Mbps
Ultra SPARC II   296MHz   Solaris 2.5.1        gcc 2.95.1   19.10 Mbps
Alpha 21164A     599MHz   Digital UNIX V4.0D   gcc 2.95.1   48.00 Mbps



7 Concluding Remarks 

We propose the block encryption algorithm "Hierocrypt" based on a two-level
nested SPN structure. We use MDS mappings for the permutations at both levels, which
assures the minimum number of active S-boxes hierarchically. SQUARE and
Rijndael can be regarded as nested SPN ciphers in which the higher- and
lower-level diffusion layers ($MDS_H$ and $MDS_L$) are made by using the same
MDS matrix. In Hierocrypt, on the other hand, the diffusion layers of the two levels are
designed independently. This independence helps to improve the
security against many attacks, including the SQUARE dedicated attack.



Appendix 

A Improved Algorithm 

We have designed a revised version named “Hierocrypt-3” based on the Type- 
II algorithm. All components except for the nested SPN structure of the data 
randomization part are modified. The following modified components of data 
randomization are presented in this Appendix. 

1. S-box

2. Higher-Level Diffusion $MDS_H$

3. Lower-Level Diffusion $MDS_L$

A.1 S-Box

The new S-box S(x) is given by the following table in hexadecimal notation.

(S(0) S(1) ··· S(F) S(10) ··· S(FF))

= ( 07 FC 55 70 98 8E 84 4E BC 75 CE 18 02 E9 5D 80
    1C 60 78 42 9D 2E F5 E8 C6 7A 2F A4 B2 5F 19 87
    0B 9B 9C D3 C3 77 3D 6F B9 2D 4D F7 8C A7 AC 17
    3C 5A 41 C9 29 ED DE 27 69 30 72 A8 95 3E F9 D8
    21 8B 44 D7 11 0D 48 FD 6A 01 57 E5 BD 85 EC 1E
    37 9F B5 9A 7C 09 F1 B1 94 81 82 08 FB C0 51 0F
    61 7F 1A 56 96 13 C1 67 99 03 5E B6 CA FA 9E DF
    D6 83 CC A2 12 23 B7 65 D0 39 7D 3B D5 B0 AF 1F
    06 C8 34 C5 1B 79 4B 66 BF 88 4A C4 EF 58 3F 0A
    2C 73 D1 F8 6B E6 20 B8 22 43 B3 33 E7 F0 71 7E
    52 89 47 63 0E 6D E3 BE 59 64 EE F6 38 5C F4 5B
    49 D4 E0 F3 BB 54 26 2B 00 86 90 FF FE A6 7B 05
    AD 68 A1 10 EB C7 E2 F2 46 8A 6C 14 6E CF 35 45
    50 D2 92 74 93 E1 DA AE A9 53 E4 40 CD BA 97 A3
    91 31 25 76 36 32 28 3A 24 4C DB D9 8D DC 62 2A
    EA 15 DD C2 A5 0C 04 1D 8F CB B4 4F 16 AB AA A0 )

$S(x) = Add(Power(Perm(x)))$ ,

where Perm is a bit permutation

$Perm : GF(2)^8 \to GF(2)^8$ ,
$x_i \mapsto x_{\pi(i)}$ ,

Table 6. Bit permutation for S-box

i       1  2  3  4  5  6  7  8
π(i)    3  7  5  8  6  2  4  1

Power is the power of 247 over GF($2^8$) with the primitive polynomial
$z^8 + z^6 + z^5 + z + 1$,

$Power : GF(2^8) \to GF(2^8)$ ,
$Power(x) = x^{247}$ ,

and Add is a constant addition,

$Add : GF(2)^8 \to GF(2)^8$ ,
$Add(x) = x \oplus \mathrm{0x07}$ .

The differential and linear probabilities are those of the power function Power;
neither the bit permutation Perm nor the constant addition Add changes them.

Perm is chosen so that the number of polynomials of output bits is the
maximum, in order to improve the security against the interpolation attack.

Add is chosen so that the distribution of input and output Hamming distances
is nearest to that of a random function, in order to remove statistical bias.

The main purpose of the S-box modification is to improve the security against
differential and linear cryptanalysis. The maximum differential and linear
probabilities of the S-box are $2^{-6}$ and $2^{-4}$, respectively, which are proven to
be the theoretical minimum. The maximum differential and linear characteristic
probabilities for two rounds are respectively estimated as $2^{-150}$ and $2^{-76}$ when
the above S-box is used. This estimation indicates that both the differential and
linear characteristic probabilities saturate after two rounds.



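A.2 Lower-Level Diffusion $MDS_L$

The lower-level diffusion $MDS_L$ is given as follows.

$MDS_L : GF(2^8)^4 \to GF(2^8)^4$ ,

$\begin{pmatrix} y_{i1} \\ y_{i2} \\ y_{i3} \\ y_{i4} \end{pmatrix} = \begin{pmatrix} \mathrm{C4} & \mathrm{65} & \mathrm{C8} & \mathrm{8B} \\ \mathrm{8B} & \mathrm{C4} & \mathrm{65} & \mathrm{C8} \\ \mathrm{C8} & \mathrm{8B} & \mathrm{C4} & \mathrm{65} \\ \mathrm{65} & \mathrm{C8} & \mathrm{8B} & \mathrm{C4} \end{pmatrix} \begin{pmatrix} x_{i1} \\ x_{i2} \\ x_{i3} \\ x_{i4} \end{pmatrix}$ .

We have chosen this matrix from circulant MDS matrices so that the output
of the SP-function (the composite function of the S-box and $MDS_L$) has the maximum
number of polynomial terms.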



A.3 Higher-Level Diffusion $MDS_H$

The higher-level diffusion $MDS_H$ is given as follows.

$MDS_H(X) = \tilde{D}_H X$ , $\tilde{D}_H = D_H \otimes I_8$ ,

$D_H = \begin{pmatrix} \mathrm{5} & \mathrm{5} & \mathrm{A} & \mathrm{E} \\ \mathrm{E} & \mathrm{5} & \mathrm{5} & \mathrm{A} \\ \mathrm{A} & \mathrm{E} & \mathrm{5} & \mathrm{5} \\ \mathrm{5} & \mathrm{A} & \mathrm{E} & \mathrm{5} \end{pmatrix}$ ,

$\left(\begin{array}{c} y_{11}\\ y_{12}\\ y_{13}\\ y_{14}\\ \vdots \end{array}\right) = \left(\begin{array}{cccccccccccccccc}
1&0&1&0&1&0&1&0&1&1&0&1&1&1&1&1\\
1&1&0&1&1&1&0&1&1&1&1&0&0&1&1&1\\
1&1&1&0&1&1&1&0&1&1&1&1&0&0&1&1\\
0&1&0&1&0&1&0&1&1&0&1&0&1&\cdots&&\\
\vdots&&&&&&&&&&&&&&&
\end{array}\right) \left(\begin{array}{c} x_{11}\\ x_{12}\\ x_{13}\\ x_{14}\\ \vdots \end{array}\right)$

The MDS matrix is chosen so that any byte is connected to all bytes after
one round (one $MDS_L$ and one $MDS_H$) through more than one intermediate
byte. This condition is imposed to improve the security against the SQUARE
dedicated attack. Our evaluation shows that the condition makes the nested SPN
cipher at least one half-round stronger than SQUARE and Rijndael against the
attack.

Abstract. We introduce a new family of symmetric block ciphers based
on group bases. The main advantage of our approach is its full scalability.
It enables us to construct, for instance, a trivial 8-bit Caesar cipher
as well as a strong 256-bit cipher with 512-bit key, both from the same
specification. We discuss the practical aspects of the design, especially
the choice of carrier groups, generation of random group bases and an
efficient factorization algorithm. We also describe how the cryptographic
properties of the system are optimized, and analyze the influence of
parameters on its security. Finally we present some experimental results
regarding the speed and security of concrete ciphers from the family.



1 Introduction 

A good block cipher should possess several properties. In addition to security 
and efficiency, which are essential, there are other important attributes like 
generality, scalability and theoretical foundations. In what follows we discuss 
these properties in more detail. 

A block cipher can be characterized by two basic parameters: the block length
n and the key length k, both expressed as a number of bits. For each of the $2^k$
possible keys, the cipher defines a bijective mapping between the $2^n$ plaintext
blocks and the $2^n$ ciphertext blocks. As the plaintext and ciphertext spaces are
usually the same, we can view an n-bit block cipher as defining a permutation
on a set of $2^n$ elements for each possible key. A simple key-indexed lookup table
containing all 64-bit numbers in random order would implement a very strong 
64-bit block cipher, without any additional algorithm. Unfortunately, such an 
implementation would take so much memory, that it would not be applicable 
for any practical use. Almost all modern block ciphers simulate such a large 
lookup random table using smaller tables (S-Boxes) in combination with other 
transformations. The goal is to make the dependence between the plaintext, 
ciphertext and key so complex that it is virtually indistinguishable from the 
random case. 



D.R. Stinson and S. Tavares (Eds.): SAC 2000, LNCS 2012, pp. 89—105, 2001. 
(c) Springer-Verlag Berlin Heidelberg 2001

As $2^n$ elements can be permuted in $(2^n)!$ different ways, a "perfect" n-bit block
cipher could accept keys of length up to $\lfloor \log_2((2^n)!) \rfloor$ bits. This means a
1683-bit key for n = 8 and approximately a $10^{21}$-bit key for n = 64. Of course, no
one needs such long keys and it would be extremely impractical to use them.
These large numbers show, however, the strong potential of block ciphers and
the restricted generality of current systems which use by design a fixed key
length or fixed S-boxes. These ciphers are in our opinion not flexible enough.
They are constrained to one specific configuration and the functions defined by
them might be far from random permutations.
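The two key lengths quoted for a "perfect" cipher can be reproduced directly. The sketch below computes the exact value for small n from the factorial and an estimate for large n from the log-gamma function; the function names are illustrative.

```python
from math import factorial, lgamma, log

def perfect_key_bits_exact(n: int) -> int:
    """floor(log2((2^n)!)), computed exactly; practical only for small n."""
    return factorial(2 ** n).bit_length() - 1

def perfect_key_bits_estimate(n: int) -> float:
    """log2((2^n)!) via the log-gamma function, usable for large n."""
    return lgamma(2.0 ** n + 1) / log(2)

print(perfect_key_bits_exact(8))                 # 1683
print(f"{perfect_key_bits_estimate(64):.3e}")    # on the order of 10^21
```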

Another frequent drawback of block ciphers is limited or entirely missing
scalability. Because of the unprecedented growth rate of computer power available
to the public, it is highly desirable to have choices for some basic parameters
of the cipher. If our ciphers were fully scalable, we could just adapt the values
of these parameters when some new, amazing breakthrough in processor or
memory technology occurs. The values of the parameters n and k could be
easily changed without a complete redesign of the cipher and we would not be
forced to throw away all research on the properties of the cipher, starting again
from the beginning.

For example, the block length of DES is 64 bits. If we wish to create a 128-bit
version of DES, we would have to design new, larger S-Boxes. As the design of
good S-Boxes is far from a trivial task [1], the properties of the new DES could
be quite different. This, in fact, would be a totally new cipher. Another cipher,
IDEA [2], has a very plain structure and its block length might be doubled
simply by increasing the length of each of the four subblocks from 16 to 32 bits.
The fact, however, that $2^{16} + 1$ is a prime number is essential to the functionality
of IDEA. Since $2^{32} + 1$ is not a prime, the doubled version would not work well. In
contrast to DES and IDEA there are already some nice examples of well-founded
scalable designs available like RC5 [12], RC6 [13] or Rijndael [14].

Each new cipher should be studied extensively, perhaps for several years, be- 
fore it is deemed trustworthy and is presented for widespread use. If the cipher 
is based on a strong theoretical foundation, we can gain a better understand- 
ing of possible failures, cryptanalytic attacks, etc., and we have stronger tools 
with which to analyze the new algorithm. Therefore, a cipher based on strong 
mathematical foundations will either be rejected outright, or if it appears work- 
able, there should be a reasonable chance for it to have provable reliability and 
trustworthiness . 

All in all, we think that an ideal cipher should not only be secure and fast but 
also theoretically well-founded, general, and scalable. In this paper we present a 
new family of fully scalable block ciphers which is quite general. Our approach 
enables the construction of a range of ciphers, from a tiny toy cipher to a large, 
secure one. The idea on which the encryption is based is a mapping of group 
elements between two random group bases. A subject which does not know the 
two secret bases is not able to recover the mapping. We discuss the selection 
of suitable carrier groups, the generation of random group bases which enable 
an efficient factorization and the optimization of the cryptographic properties of
the system. Finally, we discuss the security, speed and memory requirements of
a concrete software implementation.

2 The Principle of Encryption 

The ciphers we propose utilize group theory [3]. Although we focus our attention
on permutation groups, it is also possible to construct a cryptosystem based on
any carrier group in other representation forms. By a permutation of n symbols
we understand a bijective function $p : \mathbb{Z}_n \to \mathbb{Z}_n$. The Cartesian representation
of p is the vector [p(0), p(1), ..., p(n − 1)]. This is the usual way to represent
permutations in computers. The operation * on permutations is defined as follows:
[p(0), ..., p(n − 1)] * [q(0), ..., q(n − 1)] = [q(p(0)), ..., q(p(n − 1))]. The basic
notion needed for the ciphers proposed in this paper is the idea of a group basis.
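In the Cartesian representation the operation * is a single pass of table lookups. The following sketch uses Python lists; the function name `compose` is an illustrative choice.

```python
def compose(p, q):
    """Return p * q, the permutation i -> q(p(i)): apply p first, then q."""
    return [q[pi] for pi in p]

# The operation is not commutative in general.
print(compose([2, 0, 1], [0, 2, 1]))  # [1, 0, 2]
print(compose([0, 2, 1], [2, 0, 1]))  # [2, 1, 0]
```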

Definition 1. Group Basis

Let G be a finite group. A group basis for G is an ordered collection $\beta =
(B_0, B_1, \ldots, B_{w-1})$ of ordered subsets $B_i = (b_{i,0}, b_{i,1}, \ldots, b_{i,r_i-1})$ of G such that
each element $p \in G$ can be expressed uniquely as a product of the form:

$p = b_{0,x_0} * b_{1,x_1} * \cdots * b_{w-1,x_{w-1}}$ ; $b_{i,x_i} \in B_i$ .

The $B_i$ are called the blocks of $\beta$, the vector of block lengths $r = (r_0, r_1, \ldots, r_{w-1})$
is called the type of $\beta$ and the number w the dimension of $\beta$. Each $p \in G$
corresponds to a unique index vector $x = (x_0, x_1, \ldots, x_{w-1})$, where $x_i \in \mathbb{Z}_{r_i}$.
The space of all index vectors is $X = \mathbb{Z}_{r_0} \times \mathbb{Z}_{r_1} \times \cdots \times \mathbb{Z}_{r_{w-1}}$. The index set X
has cardinality $|X| = r_0 \cdot r_1 \cdots r_{w-1} = |G|$.

A basis $\beta$ describes a bijective mapping $\beta : X \to G$ as follows:

$\beta(x) = \beta(x_0, x_1, \ldots, x_{w-1}) = b_{0,x_0} * b_{1,x_1} * \cdots * b_{w-1,x_{w-1}} = p$ .

When computing $p = \beta(x)$ we say that p is composed from the factors $b_{i,x_i}$.
Computing the inverse function $x = \beta^{-1}(p)$ is called factorizing p with respect to
$\beta$.

It should be noticed that the concept of group basis in this paper strongly
differs from the notion of base given in the standard literature of group theory (see
e.g. [3]). For more discussion of the group bases defined here (also called logarithmic
signatures in [5]) the reader is referred to [5] and [6].

One can think of a group basis as a kind of w-dimensional discrete coordinate
system, as illustrated in Fig. 1. The six permutations of G might be seen as points
in a 2-dimensional space. Any one of the six points can be expressed as a unique
sum¹ of two points, one from each axis. The two axes, the first with three and
the second with two points, correspond to the two blocks of $\beta$.

¹ Addition of points in this discrete geometry is defined by means of vectors as:
$(x_1, y_1) + (x_2, y_2) = (x_1 + x_2 \bmod 3,\ y_1 + y_2 \bmod 2)$. Note that while this is a
commutative operation, the operation * in $S_3$ is not. This is only an illustrative example.

Example 1. A Group Basis for G = $S_3$

G = {[0,1,2], [0,2,1], [1,0,2], [1,2,0], [2,0,1], [2,1,0]}

Fig. 1. A coordinate system

w = 2, r = (3,2),

$p = [1,0,2] = [2,0,1] * [0,2,1] = b_{0,1} * b_{1,1}$ ,

$\beta^{-1}(p) = (1, 1) = x$

The crucial property of group bases from the cryptographic point of view is
that there is an enormous number of different group bases for a given group. We
denote the set of all bases that generate G by $\mathcal{B}_G$. For example, the tiny group
$S_3$ of our example has 924 different bases. 6! of them are one-dimensional bases
of type (6) and $2! \cdot 2^3 \cdot 3! + 3! \cdot 3^2 \cdot 2!$ of them are two-dimensional bases of types
(2,3) and (3,2). Later we will describe how all these bases can be generated. The
most basic version of a secret-key cryptosystem based on group bases is defined
as follows:

Definition 2. A Block Cipher Based on Group Bases

Let G be a finite group, called the carrier group. Let $\lambda : \mathbb{Z}_{|G|} \to G$ be any
fixed bijective function. The plaintext and ciphertext spaces for the cipher are
the same: $\mathcal{P} = \mathcal{C} = \mathbb{Z}_{|G|}$. The key space is the set $\mathcal{K} = \mathcal{B}_G \times \mathcal{B}_G$.

Let $k = (\beta_1, \beta_2)$, $k \in \mathcal{K}$, be a secret key. Let $x \in \mathcal{P}$ be a plaintext and $y \in \mathcal{C}$
the corresponding ciphertext. The encryption function $e_k : \mathcal{P} \to \mathcal{C}$ is defined by
the rule

$y = e_k(x) = \lambda^{-1}(\beta_2(\beta_1^{-1}(\lambda(x))))$

and the decryption function $d_k : \mathcal{C} \to \mathcal{P}$ is defined as

$x = d_k(y) = \lambda^{-1}(\beta_1(\beta_2^{-1}(\lambda(y))))$ .

In other words, we take two random group bases for G, $\beta_1$ and $\beta_2$, and each
time we want to encrypt some $p \in G$, we have to find the $p' \in G$ which has
the same coordinates in $\beta_2$ as p has in $\beta_1$. The function $\lambda$ only defines a unique
numbering of the group elements.
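Definitions 1 and 2 can be tried out on the carrier group $S_3$ with a toy implementation that tabulates each basis. In the sketch below the two bases `beta1` and `beta2` are hypothetical examples of type (3,2), chosen only so that `beta1` agrees with the single factorization shown in Example 1; the numbering $\lambda$ is taken to be lexicographic order. None of these choices comes from the paper, and a real cipher would factorize algorithmically rather than by table lookup.

```python
from itertools import permutations, product

def compose(p, q):
    """p * q as defined in the text: i -> q(p(i))."""
    return tuple(q[i] for i in p)

def basis_map(basis):
    """Tabulate beta: index vector -> product of the selected block elements."""
    degree = len(basis[0][0])
    table = {}
    for index in product(*(range(len(block)) for block in basis)):
        element = tuple(range(degree))          # identity permutation
        for block, i in zip(basis, index):
            element = compose(element, block[i])
        table[index] = element
    return table

def make_cipher(beta1, beta2, group):
    """Return encryption and decryption tables over Z_|G| (Definition 2)."""
    lam = sorted(group)                         # lambda: number -> group element
    comp1, comp2 = basis_map(beta1), basis_map(beta2)
    # Each basis must reach every group element exactly once.
    assert len(set(comp1.values())) == len(group) == len(set(comp2.values()))
    fact1 = {p: x for x, p in comp1.items()}    # beta1^-1 (factorization)
    fact2 = {p: x for x, p in comp2.items()}    # beta2^-1
    enc = [lam.index(comp2[fact1[lam[x]]]) for x in range(len(lam))]
    dec = [lam.index(comp1[fact2[lam[y]]]) for y in range(len(lam))]
    return enc, dec

S3 = list(permutations(range(3)))
beta1 = [[(0, 1, 2), (2, 0, 1), (1, 2, 0)], [(0, 1, 2), (0, 2, 1)]]
beta2 = [[(1, 2, 0), (0, 1, 2), (2, 0, 1)], [(1, 0, 2), (0, 1, 2)]]
enc, dec = make_cipher(beta1, beta2, S3)
print(enc, dec)
```

Both bases must have the same type here so that an index vector obtained from one basis is a valid index vector for the other.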

Again, for better visualization, we take a simple example with geometric
coordinates (Fig. 2). There are 16 numbered points in the space, thus we can
encrypt and decrypt the plaintexts and ciphertexts from $\mathbb{Z}_{16}$. Suppose we want
to encrypt plaintext point 14. First we find the coordinates of point 14 with
respect to basis $\beta_1$. The corresponding index vector is (2,3). Now we compose
the point which has the same coordinates in $\beta_2$; this gives us point 10. Therefore
$e_{(\beta_1,\beta_2)}(14) = 10$. The complete table for $e_{(\beta_1,\beta_2)}$ is displayed on the right-hand side.

Fig. 2. A mapping of points between two bases

In a real-world application the dependencies become much more complex. A
64-bit cipher can use, for instance, two 8-dimensional bases with 256 elements on
each "axis". Moreover, the carrier group will not necessarily be commutative.

3 Implementation Aspects 

3.1 The Carrier Group G and Function A 

The size of the plaintext and ciphertext space depends directly on the order of
the carrier group G. We are only interested in groups whose order is a power
of two, the so-called 2-groups or binary groups. More precisely, we should have
$|G| = 2^{8k}$ for a natural number k, because only ciphers whose blocks fit exactly
in k bytes are interesting for practical use. Note that $|S_m| = m! \neq 2^n$ for any
m > 2. Therefore the symmetric group $S_m$ is not suitable as a carrier group.

Group $\mathbb{Z}_2^n$. The simplest available 2-group is the elementary abelian group

$\mathbb{Z}_2^n = \underbrace{\mathbb{Z}_2 \times \mathbb{Z}_2 \times \cdots \times \mathbb{Z}_2}_{n\ \text{times}}$ .

It contains the permutations of 2n symbols of the form $p = [a_0, a_1, \ldots, a_{2n-1}]$ where
for each pair of symbols $a_{2k}, a_{2k+1}$, $k \in \mathbb{Z}_n$, either $a_{2k} = 2k$ and $a_{2k+1} = 2k + 1$,
or else $a_{2k} = 2k + 1$ and $a_{2k+1} = 2k$.

The permutations of $\mathbb{Z}_2^n$ can be represented very efficiently with our so-called
compact representation. The compact representation of $p = [a_0, a_1, \ldots, a_{2n-1}]$,
$p \in \mathbb{Z}_2^n$, is the binary vector $x = (x_0, x_1, \ldots, x_{n-1})$ where $x_i = 0$ if and only
if $a_{2i} = 2i$. Otherwise $x_i = 1$. In other words, the i-th bit of the compact
representation indicates whether the elements $a_{2i}$ and $a_{2i+1}$ have been swapped
or not. In terms of memory requirements the compact representation is optimal,
as it is impossible to represent the $2^n$ elements in fewer than n bits. Another
benefit of the compact representation is that it makes it possible to multiply
permutations very fast. Note that if $x_1$ is the compact representation of $p_1$
and $x_2$ of $p_2$, then $x_1$ XOR $x_2$ is the compact representation of the product
$p_1 * p_2$. The operation * in $\mathbb{Z}_2^n$ is commutative and linear. Last but not least, the
compact representation of the permutations fulfills the role of the function $\lambda$ from
Definition 2. If we consider the vector $(x_0, x_1, \ldots, x_{n-1})$ as a binary representation
of a natural number, we have a unique numbering of all permutations in the
group.
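The relation between the product of two such permutations and the XOR of their compact representations is easy to check. The sketch below converts in both directions; the function names are illustrative.

```python
def to_compact(p):
    """Bit i is 1 exactly when the pair (2i, 2i+1) is swapped in p."""
    return [0 if p[2 * i] == 2 * i else 1 for i in range(len(p) // 2)]

def from_compact(bits):
    """Rebuild the permutation of 2n symbols from its n-bit compact form."""
    p = []
    for i, bit in enumerate(bits):
        p += [2 * i + 1, 2 * i] if bit else [2 * i, 2 * i + 1]
    return p

def compose(p, q):
    """p * q: apply p first, then q."""
    return [q[pi] for pi in p]

p1 = from_compact([1, 0, 1])      # [1, 0, 2, 3, 5, 4]
p2 = from_compact([1, 1, 0])      # [1, 0, 3, 2, 4, 5]
print(to_compact(compose(p1, p2)))  # [0, 1, 1], the bitwise XOR of the two vectors
```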

The group basis for $\mathbb{Z}_2^n$ of the form $\alpha = (A_0, \ldots, A_{n-1})$, where each block
$A_i$ contains two permutations in compact representation, the identity
$(\underbrace{0 0 \ldots 0}_{n\ \text{times}})$ and a single swap at the i-th place
$(\underbrace{0 \ldots 0}_{i-1\ \text{times}}\, 1\, \underbrace{0 \ldots 0}_{n-i\ \text{times}})$, is called the canonical
basis for $\mathbb{Z}_2^n$. The one-element set $c_i = \{i\}$ is called the set of key bit positions
for block $A_i$.



Group $\mathcal{H}_s \times \mathcal{H}_1$. In contrast with $\mathbb{Z}_2^n$, the most complex 2-group is $\mathcal{H}_s$, the
largest binary subgroup of $S_n$. When $n = 2^s$, the order of $\mathcal{H}_s$ is $2^{2^s - 1}$. $\mathcal{H}_s$ is also
known as the Sylow 2-subgroup of $S_{2^s}$.

Definition 3. Sylow 2-subgroup $\mathcal{H}_s$ of the symmetric group $S_n$, $n = 2^s$.

The group $\mathcal{H}_s$ is defined recursively as follows:

- $\mathcal{H}_1 = \mathbb{Z}_2$

- $\mathcal{H}_s = (\mathcal{H}_{s-1} \times \mathcal{H}_{s-1}) \cdot \mathbb{Z}_2$, for $s > 1$.

The permutation representation $\mathcal{T}_s$ of the $\mathbb{Z}_2$ appearing in $\mathcal{H}_s$ contains two
permutations of $2^s$ elements, the identity $\iota$ and the involution $\tau_s$, which swaps
the two halves $\{0, 1, \ldots, 2^{s-1} - 1\}$ and $\{2^{s-1}, \ldots, 2^s - 1\}$, each of length $2^{s-1}$.
For example $\mathcal{T}_1 = \{[0, 1], [1, 0]\}$, $\mathcal{T}_2 = \{[0, 1, 2, 3], [2, 3, 0, 1]\}$, etc.

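Example 2. $\mathcal{H}_s$ and $\alpha_s$ for $s = 1, 2, 3$.

$\mathcal{H}_1 = \mathcal{T}_1 = \{[0,1], [1,0]\}$, $|\mathcal{H}_1| = 2^{2^1 - 1} = 2$

$\mathcal{H}_2 = (\mathcal{H}_1 \times \mathcal{H}_1) \cdot \mathcal{T}_2 = \{[0,1,2,3], [1,0,2,3], [0,1,3,2], [1,0,3,2],
[2,3,0,1], [2,3,1,0], [3,2,0,1], [3,2,1,0]\}$, $|\mathcal{H}_2| = 2^{2^2 - 1} = 8$

$\mathcal{H}_3 = (\mathcal{H}_2 \times \mathcal{H}_2) \cdot \mathcal{T}_3 = \{[0,1,2,3,4,5,6,7], \ldots, [7,6,5,4,3,2,1,0]\}$, $|\mathcal{H}_3| = 2^{2^3 - 1} = 128$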
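The recursive definition can be turned into a small enumeration to confirm the orders quoted in the example. This is a minimal Python sketch, not taken from the paper; `sylow2` is a hypothetical name, and permutations are stored as tuples of images.

```python
def sylow2(s):
    """Enumerate the Sylow 2-subgroup of S_(2^s) following the recursive definition."""
    if s == 1:
        return [(0, 1), (1, 0)]
    prev = sylow2(s - 1)
    half = 2 ** (s - 1)
    elems = []
    for a in prev:
        for b in prev:
            # Element of the direct product: a acts on the lower half, b on the upper half.
            direct = a + tuple(half + v for v in b)
            # The same element combined with the involution that swaps the two halves.
            swapped = direct[half:] + direct[:half]
            elems.append(direct)
            elems.append(swapped)
    return elems


def compose(p, q):
    """Product of two permutations (q applied first, then p)."""
    return tuple(p[q[k]] for k in range(len(p)))


for s in (1, 2, 3):
    print(s, len(sylow2(s)))
```

The counts printed for s = 1, 2, 3 are 2, 8 and 128, matching $2^{2^s - 1}$. The enumeration grows doubly exponentially, so it is only useful for such toy sizes.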

Each $\mathcal{H}_s$ has a unique canonical basis $\alpha_s$ which contains $2^s - 1$ blocks, each
consisting of two permutations. Each block $A_i$ has one key bit position $c_i = \{i\}$.
A basis $\alpha_s$ is constructed recursively as follows:

Here, $I_s$ is defined as $(0, 1, \ldots, 2^s - 1)$, $\bar{I}_s = 2^s + I_s = (2^s, 2^s + 1, \ldots, 2^{s+1} - 1)$ and
$J_s$ denotes a $(2^{s+1} - 2) \times 2^s$ array each row of which is equal to $I_s$. Analogously,
$\bar{J}_s = 2^s + J_s$.

Definition 4. The compact representation of elements in $\mathcal{H}_s$.

Let $h$ be a permutation from $\mathcal{H}_s$. The binary vector $\alpha_s^{-1}(h) = (x_0, x_1, \ldots, x_{w-1})$,
$w = 2^s - 1$, is called the compact representation of $h$.

Again, the compact representation is optimal in terms of memory requirements.
In general, we can say that each $h \in \mathcal{H}_s$ can be uniquely represented
by a $(2^s - 1)$-bit binary number. The multiplication of permutations in $\mathcal{H}_s$ is
a non-linear and non-commutative operation. It can be performed directly and
efficiently in the compact representation [6].

As already mentioned, the preferred order of the carrier group should be
a number of the form $2^{8k}$, where $k \in \mathbb{N}$. However, the order of $\mathcal{H}_s$ is $2^{2^s - 1}$ and
$2^s - 1 \neq 8k$. Thus real-world ciphers will be based on a group whose compact
representation is one bit longer. This can simply be achieved by using the slightly
modified group $\mathcal{H}_s \times \mathcal{H}_1$ instead of $\mathcal{H}_s$. The compact representation grows by one
bit to the desired $2^s$ bits and the multiplication stays in principle the same as in $\mathcal{H}_s$;
only the highest bit must be handled (XOR-ed) separately. The multiplication
operation continues to be non-commutative and non-linear. From now on we
suppose that all permutations and all group bases are stored and manipulated
only in the compact representation.

We have presented the two most extreme examples of permutation 2-groups,
the simplest, $\mathbb{Z}_2^n$, which is commutative, and the most complex, $\mathcal{H}_s \times \mathcal{H}_1$. In
principle any other 2-group can be used in an appropriate representation. New
2-groups for our cryptographic purposes can be constructed from the available ones
by taking wreath products, direct products, extensions and their combinations [6].



3.2 Key Generation 

A key for our cryptosystem consists of two randomly chosen group bases. This 
approach ensures an extremely high upper limit of the scalable key space. Be- 
cause the bases can hardly be entered manually by the user, we need a mechanism 
for generating random bases. Possibly, in cases where a fixed key length is ex- 
pected, the bases could be generated from a binary key of fixed length or from 
a pass-phrase, in conjunction with the use of a pseudo-random number genera- 
tor, which in turn is based on a subsystem implementing a fixed version of our 
system. 

In general, not every basis enables a fast factorization. An efficient factorization
algorithm is only known for so-called transversal group bases [5], [6].
Therefore we want to generate only bases of this kind. The Basis Generation
Algorithm (BGA) starts from the canonical basis $\alpha$ and carries out the following
four steps:
1. The commutative block shuffle operation randomly changes the order of
the blocks by multiple swaps of two adjacent blocks. Two blocks
$B_i = (b_{i,0}, \ldots, b_{i,r_i-1})$ and $B_{i+1} = (b_{i+1,0}, \ldots, b_{i+1,r_{i+1}-1})$ can be swapped only if
$b_{i,j} * b_{i+1,k} = b_{i+1,k} * b_{i,j}$ for $j \in \mathbb{Z}_{r_i}$ and $k \in \mathbb{Z}_{r_{i+1}}$.

2. The block fusion operation replaces two randomly chosen, adjacent blocks
$B_i = (b_{i,0}, \ldots, b_{i,r_i-1})$, having the set of key bit positions $c_i$, and
$B_j = (b_{j,0}, \ldots, b_{j,r_j-1})$, $j = i + 1$, having the key bit positions $c_j$, by a single
longer block $B'_i = B_i \times B_j = (b_{i,m} * b_{j,n} : m \in \mathbb{Z}_{r_i}, n \in \mathbb{Z}_{r_j})$ having the
key bit positions $c'_i = c_i \cup c_j$. Note that block fusion changes the type of
the basis from $r = (r_0, r_1, \ldots, r_{w-1})$ to
$r' = (r_0, r_1, \ldots, r_{i-1}, r_i \cdot r_{i+1}, r_{i+2}, \ldots, r_{w-1})$ and decreases the dimension
of the basis from $w$ to $w - 1$.

3. The randomization operation replaces each $b_{i,j} \in B_i$, $i \in \{1, 2, \ldots, w - 1\}$,
$j \in \mathbb{Z}_{r_i}$, by $b'_{i,j} = b_{i,j} * \prod_{k=0}^{i-1} b_{k,l_k}$, where $l_k \in \mathbb{Z}_{r_k}$ is chosen randomly for
every combination of $i$, $j$ and $k$.

4. The element shuffle operation randomly changes the order of the elements
within each block.

Each step can be skipped or carried out several times. If $\beta \in \mathcal{B}_G$ then each $\beta'$
generated from $\beta$ by any combination of these steps is also in $\mathcal{B}_G$. Moreover,
BGA preserves transversality, so all bases generated from the transversal $\alpha$ will
enable a fast factorization. For instance, the basis $\beta_2$ in Fig. 2 was created
from $\beta_1$ by a block shuffle (the axes $B_0$ and $B_1$ are exchanged) and an element
shuffle (the indices of the points on each axis are shuffled). Block fusion and
randomization were not applied there.
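Three of the four steps can be sketched for the commutative carrier group, where multiplication in compact form is XOR. The Python below is an illustration under stated assumptions, not the authors' implementation: a basis is modelled as a list of `(elements, key_bit_positions)` pairs with elements stored as integers, the randomization multiplies each element by one random element from every lower block (our reading of step 3), and the block shuffle is omitted.

```python
import random


def canonical_basis(n):
    """Canonical basis of Z_2^n: block i holds the identity and the swap at bit i."""
    return [([0, 1 << i], [i]) for i in range(n)]


def fuse(basis, i):
    """Block fusion: replace blocks i and i+1 by all pairwise products (XOR)."""
    (b1, c1), (b2, c2) = basis[i], basis[i + 1]
    fused = ([x ^ y for x in b1 for y in b2], c1 + c2)
    return basis[:i] + [fused] + basis[i + 2:]


def randomize(basis, rng):
    """Multiply every element of block i >= 1 by a random element of each lower block."""
    out = [basis[0]]
    for i in range(1, len(basis)):
        block, keys = basis[i]
        new_block = []
        for b in block:
            for lower, _ in basis[:i]:
                b ^= rng.choice(lower)
            new_block.append(b)
        out.append((new_block, keys))
    return out


def element_shuffle(basis, rng):
    """Randomly reorder the elements inside each block."""
    return [(rng.sample(block, len(block)), keys) for block, keys in basis]


def all_products(basis):
    """Every product of one element per block; a basis yields each group element once."""
    products = [0]
    for block, _ in basis:
        products = [p ^ b for p in products for b in block]
    return products


rng = random.Random(2000)          # fixed seed so the run is reproducible
beta = canonical_basis(8)
for i in (0, 1, 2, 3):             # fuse neighbouring pairs: 8 blocks become 4
    beta = fuse(beta, i)
beta = element_shuffle(randomize(beta, rng), rng)
print([len(block) for block, _ in beta])
```

After these steps the products of one element per block still enumerate all 256 group elements exactly once, which is the basis property the text says BGA preserves.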

The complete key generation scheme from the pass-phrase to the pair of 
group bases might look as shown in Fig. 3. 



Fig. 3. Key generation



A $k$-bit hash value is extracted from the pass-phrase which was entered by
the user. For example, a Cyclic Redundancy Code with a primitive polynomial
of degree $k + 1$ might be used for obtaining the $k$-bit hash. Optionally, the
$k$-bit binary key $K$ can be generated or entered directly. The length of $K$ is freely
scalable, theoretically up to several tens of thousands of bits. In practice, lengths
of about 64 to 256 bits will be used. Key $K$ is passed as a seed to a pseudo-random
number generator (PRNG) which delivers pseudo-random numbers to the BGA.
The PRNG is a sensitive part of an implementation and must be chosen
very carefully. (See also Sect. 4.1.)
3.3 Fast Factorization

Suppose $G$ is a 2-group, $|G| = 2^n$, and $\beta = (B_0, \ldots, B_{w-1})$ a transversal,
$w$-dimensional basis of $G$ of type $r = (r_0, \ldots, r_{w-1})$. Each block
$B_i = (b_{i,0}, \ldots, b_{i,r_i-1})$ contains $r_i$ permutations, where $r_i = 2^{m_i}$. The set
$c_i = \{c_{i,1}, \ldots, c_{i,m_i}\}$ contains the key bit positions for $B_i$. Let $KB_i : 2^n \to 2^{m_i}$ be a function
which extracts the key bits from a binary vector,
$KB_i(a_0, \ldots, a_{n-1}) = (a_{c_{i,1}}, \ldots, a_{c_{i,m_i}})$.

The factorization of a permutation $p \in G$ is performed level-wise. First, the
highest coordinate $x_{w-1}$ is obtained from $p_w = p$ as described below, then the
intermediate result $p_{w-1} = p_w * b^{-1}_{w-1,x_{w-1}}$ is passed to the lower level and the
process continues in the same way until the lowest level, where $x_0$ is obtained
and $p_0 = p_1 * b^{-1}_{0,x_0}$ is equal to the identity permutation in $G$.

Let $p_i = (a_0, \ldots, a_{n-1}) \in 2^n$ be an input to a factorization step. The index
$x_{i-1}$ is obtained as $x_{i-1} = F_{i-1}(KB_{i-1}(p_i))$, where $F_i : 2^{m_i} \to 2^{m_i}$ is a
bijection such that $F_i(k) = j$ if and only if $KB_i(b_{i,j}) = k$.
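The level-wise procedure can be traced on a toy transversal basis of the commutative group with $n = 4$. This Python sketch is an illustration only: the basis values are made up for the example, elements are 4-bit integers in compact form, and division is XOR because every element of this group is its own inverse.

```python
def extract_key_bits(p, key_bits):
    """KB_i: read the bits of p at the key bit positions of a block."""
    return tuple((p >> c) & 1 for c in key_bits)


def factorize(p, basis):
    """Return indices (x_0, ..., x_{w-1}) such that p is the product of b_{i,x_i}."""
    xs = [None] * len(basis)
    for i in reversed(range(len(basis))):          # start at the highest level
        block, key_bits = basis[i]
        # F_i as a lookup table: key bit pattern -> index of the element in the block.
        table = {extract_key_bits(b, key_bits): j for j, b in enumerate(block)}
        xs[i] = table[extract_key_bits(p, key_bits)]
        p ^= block[xs[i]]                          # divide by the chosen factor
    if p != 0:
        raise ValueError("basis is not transversal for this element")
    return xs


def compose(xs, basis):
    """Inverse operation: multiply the selected element of every block."""
    p = 0
    for x, (block, _) in zip(xs, basis):
        p ^= block[x]
    return p


# Hypothetical two-block basis of type (4, 4); block 1 has randomized low bits.
basis = [
    ([0b0000, 0b0010, 0b0001, 0b0011], [0, 1]),
    ([0b0110, 0b0001, 0b1111, 0b1000], [2, 3]),
]
print(factorize(0b1011, basis))
```

Each step needs only a table lookup on the key bits and one multiplication, which is why transversal bases admit fast factorization.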

One should remark that although there is a similarity between the factoriza- 
tion with respect to a transversal group basis and the Schreier-Sims algorithm 
working on strong generating sets, they are not equivalent. The concept of group 
basis is more general than the strong generating set. For comparison see the 
works [4] and [5]. 

3.4 Extensions of the Basic System 

The cryptosystem introduced in Definition 2 demonstrates the basic principle 
of encryption based on group bases, the mapping of elements from one basis to 
another. However, even if we use a non-commutative carrier group with multi- 
dimensional bases, the cryptographic properties of this mapping will not be 
sufficient. In the following we present two effective techniques which extend the 
basic setup and improve the confusion as well as the diffusion [7] of the cipher. 

Bit Reversing. During encryption a permutation $p \in G$ is factorized with
respect to the first basis $\beta$ and the resulting coordinates $x_i$ are passed to the
composition in the second basis $\beta'$. Let us suppose for a moment that both $\beta$
and $\beta'$ are of the same type; this makes the problem more obvious.

Because both bases are randomized, we can consider each factorization and
composition level as a kind of S-Box, which increases the confusion. As shown
on the left side of Fig. 4, each of the indices $x_i$ has been influenced by a different
number of S-Boxes. While $x_0$ passed all eight, $x_3$ went through only two
S-Boxes. That means that some parts of the information contained in $p$ have
been "scrambled" much less than others. This is an undesirable property,
because these parts of the information are not fully protected. Moreover, if $G$ is a
commutative group (such as $\mathbb{Z}_2^n$), a large part of the ciphertext will not depend
on $x_3$ at all. So diffusion is also reduced.

For this reason, we propose a bit reversing of the index vector $x$ before the
start of composition. Note that bit reversing is better than a simple index vector
component reversing, because the bases are not necessarily of the same type.
The new encryption and decryption functions are:

$$y = e_k(x) = \lambda^{-1}(\beta_2(R(\beta_1^{-1}(\lambda(x)))))$$

$$x = d_k(y) = \lambda^{-1}(\beta_1(R(\beta_2^{-1}(\lambda(y)))))$$

where the function $R : 2^n \to 2^n$ reverses the order of the bits of a binary
vector, $R(b_0, b_1, \ldots, b_{n-1}) = (b_{n-1}, \ldots, b_1, b_0)$, $b_i \in \{0, 1\}$. The effect of $R$ is
illustrated on the right hand side of Fig. 4.

Fig. 4. Effect of bit reversing



Now every index passes exactly five S-Boxes, resulting in a balanced confusion
of all components. The length of the index vectors $x = x_0 \| x_1 \| \ldots \| x_{w_1 - 1}$ and
$y = y_0 \| y_1 \| \ldots \| y_{w_2 - 1}$ is the same, $|x| = |y| = n$, even if the bases $\beta$ and $\beta'$ are
not of the same type. So bit reversing can be used in this case as well.
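The point about bases of different type can be made concrete. In the Python sketch below the indices produced by a basis with four 2-bit blocks are concatenated, bit-reversed, and re-split for a basis with two 4-bit blocks. The helper names and the little-endian bit order inside each index are assumptions made for this illustration; the paper does not fix them in this section.

```python
def pack(indices, widths):
    """Concatenate indices into one bit vector; widths[i] is the bit length of index i."""
    bits = []
    for x, m in zip(indices, widths):
        bits.extend((x >> k) & 1 for k in range(m))   # least significant bit first
    return bits


def unpack(bits, widths):
    """Split a bit vector back into indices of the given bit lengths."""
    indices, pos = [], 0
    for m in widths:
        chunk = bits[pos:pos + m]
        indices.append(sum(bit << k for k, bit in enumerate(chunk)))
        pos += m
    return indices


def reverse_bits(bits):
    """The function R: reverse the order of all bits of the index vector."""
    return bits[::-1]


x = pack([1, 2, 3, 0], [2, 2, 2, 2])      # indices from a basis of type (4, 4, 4, 4)
y = unpack(reverse_bits(x), [4, 4])       # indices for a basis of type (16, 16)
print(x, y)
```

Because the reversal acts on bits rather than on whole indices, it is well defined whenever both index vectors have the same total length $n$.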

Non-linear Diffusive Transformation. At each factorization level the input
$p_i$ is divided by a factor from the current basis block. Only a small
part of $p_i$, namely the key bits, determines which factor will be taken. Because
multiplication and division of permutations in the compact form of $\mathbb{Z}_2^n$ are defined
as a simple bit-wise XOR, a change of a single non-key bit of $p_i$ affects only a
single bit of $p_{i-1}$. Consequently diffusion at each factorization step is weak. Even
worse, factorization in $\mathbb{Z}_2^n$ is a linear function. Although factorization in $\mathcal{H}_s$ is
not linear and its diffusion is the best among all 2-groups, it is still not strong
enough from the cryptographic point of view. This is because the higher order
bits of a product depend only on the higher order bits of the multiplicands.

Fortunately, both the weak diffusion and linearity can be compensated by a
simple extension of group bases. Figure 5 shows the idea on a geometric analogy
to the group bases of $\mathbb{Z}_2^4$. Basis A was obtained from the canonical $\alpha_4$ by two
fusions, $B_0$ with $B_1$ and $B_2$ with $B_3$. An element shuffle of A resulted in basis
B and applying a further randomization created C. In all three cases the factor
$B_1[x_1]$ of any point depends only on its vertical position; the horizontal position
does not play any role. The points having the same factors in $B_1$ lie on parallel
horizontal lines (surfaces).




Fig. 5. Linear and non-linear bases 



However, one can also construct other, more complex bases. For instance,
in basis D, the factor $B_1[x_1]$ depends on both the vertical and the horizontal
position of a point. The lines connecting the points having the same factor in $B_1$
are no longer horizontal. Moreover, in bases E and F the surfaces are not even
linear. This can be seen as a generalization from an orthogonal two-dimensional
coordinate system to a more general geometric coordinatization (e.g. radial).

Translated back to group bases, before each factorization step the key bits
of an intermediate result $p_i$ (as defined in Sect. 3.3) will be made dependent on
all bits of $p_i$. A non-linear hash function $T_i : 2^n \to 2^{m_i}$ from $n$ to $m_i$ bits will
be used for that purpose. Let $c_i = \{c_{i,1}, \ldots, c_{i,m_i}\}$ be the set of key bit positions
for block $B_i$ and let $SB_i : 2^n \times 2^{m_i} \to 2^n$ be a function which sets the key
bits of a binary vector to a specified value,
$SB_i((a_0, \ldots, a_{n-1}), (d_1, \ldots, d_{m_i})) = (e_0, \ldots, e_{n-1})$,
such that $e_j = a_j$ for all $j \notin c_i$ and $e_{c_{i,k}} = d_k$ for all $k = 1, \ldots, m_i$.
A preprocessing step will then be defined as follows: $p'_i = SB_{i-1}(p_i, T_{i-1}(p_i))$. The
definition of the factorization step is the same as in Sect. 3.3, with the exception
that the transformed value $p'_i$ is processed instead of the original $p_i$.

In the cases A, B, C of our example, $KB_1 : 2^4 \to 2^2$ extracted two key bits
from a 4-bit word (compact representation of a $p \in \mathbb{Z}_2^4$), $KB_1(a_0, a_1, a_2, a_3) =
(a_2, a_3)$, and $F_1 : 2^2 \to 2^2$ found the appropriate factor in $B_1$, $F_1(a_2, a_3) = x_1$.
In the extended version (cases D, E, F) $x_1$ depends on all four bits of $p$. In case
E, $T_1 : 2^4 \to 2^2$ is defined by $T_1(a_0, a_1, a_2, a_3) = (a_0, a_1) + (a_2, a_3)$, in case D
by $T_1(a_0, a_1, a_2, a_3) = (a_0, a_1)$ XOR $(a_2, a_3)$ and in case F as $T_1(a_0, a_1, a_2, a_3) =
\mathrm{rotl}(a_0, a_1) + (a_2, a_3)$. Many other functions are also possible. In general, the
hash function $T$ should possess at least the following four properties:

- each bit of the output should be dependent on each bit of the input

- each output bit should be balanced (an output bit is balanced if $n_0 = n_1$, where $n_a$ is the number of inputs for which the output bit is equal to $a$)

- the function should not be linear

- the composite function $SB(p, T(p))$ must be invertible

However, one can also use stricter criteria, similar to those used in the construction
of an $n \times m_i$ S-Box [1]. When using a proper $T$, the avalanche effect in
the cipher will be very strong, because every single factorization or composition
step ensures that the avalanche criterion is fulfilled. In [11] the author defines
the term excess avalanche factor (EAF) for iterative ciphers. If we think of every
factorization or composition step of our non-iterative ciphers as a "round", then
the value of an analogue of the EAF will be equal to $w_1 + w_2$, where $w_i$ are the
dimensions of the bases used.
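Two of the listed properties, balance and invertibility of the composite function, are easy to test exhaustively on the 4-bit example. The Python sketch below does so for the XOR, addition and rotate-then-add variants of $T_1$. It rests on assumptions of ours: $a_0$ is taken as the least significant bit, "+" is read as addition modulo 4, rotl is a 1-bit rotation of a 2-bit value, and the key bits are positions 2 and 3.

```python
KEY_BITS = (2, 3)        # key bit positions of block B_1 in the 4-bit example


def lo(p):
    """The pair (a_0, a_1) as a 2-bit number."""
    return p & 0b11


def hi(p):
    """The pair (a_2, a_3) as a 2-bit number."""
    return (p >> 2) & 0b11


def rotl2(v):
    """Rotate a 2-bit value left by one position."""
    return ((v << 1) | (v >> 1)) & 0b11


def t_xor(p):
    return lo(p) ^ hi(p)


def t_add(p):
    return (lo(p) + hi(p)) % 4


def t_rotl_add(p):
    return (rotl2(lo(p)) + hi(p)) % 4


def set_key_bits(p, key_bits, value):
    """SB: overwrite the key bits of p with the bits of value."""
    for k, c in enumerate(key_bits):
        bit = (value >> k) & 1
        p = (p & ~(1 << c)) | (bit << c)
    return p


def preprocess(p, T):
    """The preprocessing step p' = SB(p, T(p))."""
    return set_key_bits(p, KEY_BITS, T(p))


def is_invertible(T):
    """True if p -> SB(p, T(p)) is a bijection on all 4-bit words."""
    return len({preprocess(p, T) for p in range(16)}) == 16


def is_balanced(T):
    """True if every output bit of T is 1 for exactly half of the inputs."""
    return all(sum((T(p) >> k) & 1 for p in range(16)) == 8 for k in range(2))


for T in (t_xor, t_add, t_rotl_add):
    print(T.__name__, is_invertible(T), is_balanced(T))
```

All three variants pass both checks in this toy setting. The sketch does not test non-linearity or full input dependence, and the XOR variant is linear by construction.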

4 Security Aspects 

The security, speed and memory requirements of our ciphers depend strongly on 
the concrete configuration. The most important parameters are: 

— the order of the carrier group (affects the block length), 

— the extent of block fusion (affects the size of “S-Boxes” and the number of 
“rounds”), 

— the function T used, 

— and the randomness of the group bases. 

In implementations where i) the carrier group has large order (e.g. $2^{128}$),
ii) the extent of block fusion is reasonably large (e.g. 12 key bits per block),
iii) a sensible non-linear $T$ was chosen and iv) the bases were generated directly
from some physical source of "true" random numbers, a high degree of security
is ensured.



4.1 The Pseudo-Random Number Generator

A key in our cryptosystem consists of two secret group bases, alternatively speak- 
ing, of two sets of several, large, key dependent S-Boxes of special structure. As 
the algorithm itself is simple and public, a possible attack would try to recon- 
struct the bases, using a chosen plaintext attack or similar techniques. 

If an implementation uses a PRNG for generating the bases, the properties of
the PRNG are crucial for the security of the cipher. The number of possible initial
states of the generator, i.e. the size of the generator's seed, must be reasonably
high, because it directly bounds the real key space of the cryptosystem, which
must be exhausted in a brute force attack. Of course, we cannot use a simple
32-bit linear congruential generator, unless we want to construct a weak cipher.
The size of the generator's seed is just one of the many measures that determine
the quality of the PRNG from a cryptographic point of view. The PRNG needs
to pass non-trivial randomness tests, like the Maurer Test [8], the Diehard suite
[9], etc. Otherwise some attacks based on dependencies within the bases might
be possible.
In our opinion, the lagged Fibonacci generator with Lüscher's approach [10]
is a proper example of an acceptable PRNG. For instance, using lags (37,100)
with a word length of 32 bits, the generator passes all statistical tests, the size of 
its seed can be scaled up to 3200 bits and the period of the generated sequence 
is 2^^^. These values can be further improved by changing the lags. 

Finally, a comment should be made about a brute force attack on a cipher
using BGA. The time needed for generating bases (usually less than one second)
is negligible for the legal user, who generates the key once, but it is a big problem 
for an attacker, who tries all possible keys. When trying, say, 2®^ different keys, 
with a delay of 500 ms per key, a brute-force attack is infeasible. 

4.2 Block Fusion 

The average length of blocks is also very important for the security of the system.
Let $2^n$ be the order of the group $G$ and let $x$ be a divisor of $n$. When BGA merges
$x$ adjacent blocks $k \cdot x, k \cdot x + 1, \ldots, k \cdot x + x - 1$, for $k \in [0, \frac{n}{x} - 1]$, we say that a
block fusion to extent $x$ was performed. ($x$ is equal to the number of key bits per
block.) The number of adjacent blocks merged need not be constant
for all fused $x$-tuples. In this case the average fusion extent can be computed by
$\bar{x} = \frac{n}{w}$, where $w$ is the dimension of the basis after block fusion.

For instance, the canonical basis $\alpha_{64}$ has 64 blocks with two permutations in
each block. Each permutation in the compact form is 64 bits long, so the whole
basis fits in 1 KB of memory. If we perform a fusion to extent 4, we obtain a basis
with 16 blocks of 16 permutations (2 KB); a fusion to extent 8 creates 8 blocks of
256 permutations (16 KB), etc. A fusion to extent 64 would result in one block
of $2^{64}$ permutations ($2^{27}$ TB). Of course, the higher the extent of block fusion,
the more secure the cipher. In the extreme case $x = n$ we obtain a full random
permutation of $2^n$ elements, which is the strongest $n$-bit cipher available. On
the other hand, the memory requirements grow exponentially with $x$ and
the quality of the PRNG becomes more critical. The more pseudo-random numbers
are generated, the higher the probability that some weakness of the PRNG might be exploited.
For these reasons a tradeoff between security and memory load must be
found. Values between 8 and 16 key bits per block would be appropriate for
practical use.
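The memory figures follow from a one-line formula: a basis fused to extent $x$ has $n/x$ blocks of $2^x$ permutations, each stored in $n$ bits. The Python sketch below reproduces the numbers quoted for the 64-bit case; the function name is ours, and 1 KB is taken as 1024 bytes and 1 TB as $2^{40}$ bytes.

```python
def basis_memory_bytes(n, extent):
    """Size of one basis in compact form: (n / extent) blocks of 2^extent n-bit permutations."""
    if n % extent != 0:
        raise ValueError("the fusion extent must divide n")
    blocks = n // extent
    perms_per_block = 2 ** extent
    return blocks * perms_per_block * (n // 8)


for extent in (1, 4, 8, 16):
    print(extent, basis_memory_bytes(64, extent) // 1024, "KB")

# Full fusion of a 64-bit basis, expressed in TB (2^40 bytes).
print(basis_memory_bytes(64, 64) // 2 ** 40, "TB")
```

For extent 16 the same formula gives 2 MB per basis, which illustrates why the text recommends staying between 8 and 16 key bits per block.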

4.3 Simplified Variants 

For speed optimization one could use some simplified variants from our general
family of block ciphers, e.g. by taking a simple $T$, using fixed key bit positions or
a fixed base-block size. However, one should be careful about it, because some of
these simplifications might compromise the security. For example, it is not clear
whether it is secure to use a BGA without block shuffling in combination with the
$\mathbb{Z}_2^n$ carrier group. Fixed key bit positions enable a faster implementation, but
given a concrete transformation $T$ it might be possible to construct a differential
which passes $w - 1$ factorization steps. Consequently a differential attack might
be possible. However, to be able to break the cipher an attacker would have to
completely reconstruct both secret group bases, which are several kilobytes long.
This does not seem to be a practical attack even for a simplified cipher.



5 Experimental Results 

We implemented a scalable software version of the proposed algorithm. We used
two carrier groups: $\mathcal{H}_s \times \mathcal{H}_1$, supporting block lengths of 8, 16, 32, 64, 128 and
256 bits, and $\mathbb{Z}_2^n$, supporting all block lengths from 32 to 512 bits that are
divisible by 32. We used a fixed key length of 128 bits, given by the number of
possible initial states of the PRNG.



5.1 Throughput 

Both algorithms were implemented in C++ and tested on a Pentium II machine
running at 350 MHz. As expected, the first carrier group was less suitable for a
software implementation. The multiplication of permutations from $\mathcal{H}_s \times \mathcal{H}_1$ in
the compact representation is a bit-oriented recursive algorithm that is not well
supported by the instruction set of the processor. To make the factorization
faster, we precomputed the inverses of all permutations in the group bases. So
we actually stored four instead of two bases. The required memory space was
about 90 KB. The throughput of the 64-bit version without transformation $T$ was
about 75 KB/s and with a simple transformation only 50 KB/s. When we used
the Cartesian representation of permutations, speeds rose to 275 KB/s without
a $T$, and 100 KB/s with a transformation. The memory requirements were about
900 KB. Even if some tighter optimization techniques were to improve the speeds
by a factor of 2 to 4, the values achieved by the software implementation cannot
be considered satisfactory. A simplified hardware version of the algorithm
with its own special multipliers running at a clock rate of 45 MHz achieves speeds
above 20 MB/s according to [6], so the group $\mathcal{H}_s \times \mathcal{H}_1$ is definitely more suitable
for a hardware implementation.

Our second implementation used $\mathbb{Z}_2^n$ as carrier group. The multiplication of
permutations in this commutative group is much faster than in $\mathcal{H}_s \times \mathcal{H}_1$. We used
a simplified BGA without block shuffle and with a fixed fusion extent of 8 blocks,
which made the factorization even more efficient. A non-linear transformation
$T$ was used at each level of the factorization and composition operations. The
64-bit version occupied 18 KB and encrypted at a rate of 2 MB per second.
The 128-bit version with memory requirements of 69 KB achieved about 1.5 MB/s
and the 256-bit version ran at 1 MB/s, occupying 270 KB of memory. Again,
some speed improvements by a factor of 2 to 3 might be possible after a strong
optimization effort. These results confirmed that $\mathbb{Z}_2^n$ is much more efficient than
$\mathcal{H}_s \times \mathcal{H}_1$, at least in software.

Unfortunately, the encryption speeds measured with $\mathbb{Z}_2^n$ are approximately 2 to
7 times slower than the speeds of recent fast block ciphers. Even if we consider
some minor possible optimizations, the achieved speeds are not satisfactory.
5.2 Randomness

We used a general statistical approach to estimate the quality of encryption. 
Similar methods have already been used in several works, for example in [15]. 
Of course, this approach can not replace a deep analysis of the cipher, but it at 
least gives us a good estimate of cipher’s quality. In our tests we encrypted a 
large amount of highly redundant non-periodic data (e.g. a sequence of blocks, 
containing n-bit binary representation of a counter sequence 0, 1, 2, ...), and 
tested the output for randomness. The idea of the testing approach is the fol- 
lowing: The cipher must provide strong diffusion, so even small changes between 
the adjacent input blocks must result in big and random looking changes in the 
output blocks. Further, the cipher must provide a strong confusion, so a system- 
atic and highly redundant input sequence must be encrypted into a sequence 
which can not be distinguished from a true random one by any statistical tests. 
The output sequences were tested by the DieHard suite of statistical tests [9]. 
The tests were carried out for many different keys. 

The data were encrypted using the carrier group $\mathbb{Z}_2^n$ and the bases were generated
with the lagged (37, 100) Fibonacci generator with Lüscher's approach.
The fixed number of key bits per block was set to 8. Here is a C-like definition
of one of the non-linear transformations used, $T : 2^n \to 2^8$, $n = 8k$, $k \in \mathbb{N}$:

byte T(vector p, int n) {
    byte sum = 0;
    for (int i = 0; i <= n/8 - 1; i++)
        sum = rotr3(sum + p[i]);
    return sum;
}

The p[i] are the 8-bit segments (bytes) of the n-bit binary vector p and the
function rotr3 performs a 3-bit right rotation of an 8-bit value.
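For experimentation, the same transformation can be written as runnable Python. This is a sketch under one assumption that the C-like definition leaves implicit: the sum is reduced modulo 256 before rotating, as it would be for an 8-bit `byte` type.

```python
def rotr3(v):
    """Rotate an 8-bit value right by three positions."""
    return ((v >> 3) | (v << 5)) & 0xFF


def T(p):
    """Hash the bytes of p to one byte: add each byte modulo 256, then rotate right by 3."""
    s = 0
    for b in p:
        s = rotr3((s + b) & 0xFF)   # assumed 8-bit wrap-around of the sum
    return s


print(T(bytes([1, 1])), T(bytes(8)))
```

The input is any bytes-like object of length $n/8$; the rotation after every addition is what makes the result depend on the order of the bytes and not only on their sum.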

Each test from the DieHard suite evaluates the quality of the input with a 
so-called p-value, $p \in [0, 1]$. Good results should lie between 0.001 and 0.999.
The tests with results below $10^{-6}$ or above $1 - 10^{-6}$ are considered as failed.
However, one must keep in mind that even a true random number generator 
generates sometimes a sequence, which “fails” the test, since all sequences, even 
the “less random” ones, appear with the same probability. 

We carried out 4400 tests for each configuration and counted the number
of significant results among these tests. A result was considered significant
(or suspect) if the value $p$ was below 0.001 or above 0.999. The average ratio
of significant results measured for our cipher was 0.0024 for a block length of 64
bits and 0.0025 for 128 bits. In our opinion the results can be considered
satisfactory. For instance, the cipher IDEA, which is regarded as one of the most
secure 64-bit ciphers today, produced on average a ratio of 0.0029 significant results in
the same test. The output of a simple linear congruential generator produced
a ratio of 0.5036 significant results.
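As a rough yardstick for these ratios, one can compute what an ideal random source would give. If p-values were exactly uniform and the tests independent (an idealization; the tests of the suite are not all independent), a result would be significant with probability 0.002. The Python sketch below compares the reported ratios with that expectation, treating each ratio as if it came from a single batch of 4400 tests, which is our simplification.

```python
import math

tests = 4400
p_significant = 0.001 + 0.001            # p below 0.001 or above 0.999
expected = tests * p_significant         # expected number of significant results
std = math.sqrt(tests * p_significant * (1 - p_significant))

# Ratios reported in the text.
observed = {"64-bit": 0.0024, "128-bit": 0.0025, "IDEA": 0.0029, "LCG": 0.5036}

for name, ratio in observed.items():
    count = ratio * tests
    z = (count - expected) / std         # distance from expectation in standard deviations
    print(f"{name}: {count:.1f} significant, z = {z:.1f}")
```

Under these assumptions the block cipher ratios lie within about one and a half standard deviations of the ideal value, while the linear congruential generator is hundreds of standard deviations away.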
6 Conclusions

We have introduced a new framework for constructing block ciphers based on
group bases. Our approach makes it possible to design a simple weak cipher, which can
be deeply analyzed and examined, as well as a large, strong one. Even the full
symmetric group of degree $2^n$ can be realized from the same specification.

In contrast to Feistel networks, our ciphers are not iterative. Instead of several
repetitions of a uniform round, a specific number of factorization and composition
steps are carried out. The security can be scaled through the average fusion
extent instead of the number of rounds. As the key of the cipher is in fact a random
pair of group bases, the potential size of the key space is much larger than for
the "classical" ciphers. In practice the group bases will be generated from some
fixed-size seed value, so they can be viewed as a set of random key-dependent
S-Boxes with a special structure.

The block length, key length and security level of the ciphers are scalable.
Some other components of the cipher, which affect the speed and memory
requirements, are also variable. The system has been optimized for maximal
confusion, diffusion and non-linearity. The results of statistical tests were very
satisfactory. Nevertheless, the cryptosystem is still too new to allow us to make strong
statements about its security. Some attacks based on the special structure of the
group bases may be possible, as well as attacks targeting the special properties
of the PRNG used. The presence of a mathematical foundation lets us hope that a
deeper theoretical analysis of the cryptosystem will be possible.

There are still some open questions about the new design, for instance:

- Compared to the fast modern block ciphers the proposed ciphers are rather
slow. Is it possible to significantly improve their speed?

- Although the general design appears to be very robust, it is not clear whether
this is also true for the simplified variants (e.g. using fixed key bit positions,
fixed base block length, etc.). How resistant are they to differential and
linear cryptanalysis?

- What is the minimal block fusion extent which provides strong security?

Research on these problems is likely to be the subject of future work
in this area.
Abstract. Koblitz, Solinas, and others investigated a family of elliptic
curves which admit faster cryptosystem computations. In this paper, we
generalize their ideas to hyperelliptic curves of genus 2. We consider the
following two hyperelliptic curves $C_a : v^2 + uv = u^5 + a u^2 + 1$ defined
over $\mathbb{F}_2$ with $a = 0, 1$, and show how to speed up the arithmetic in the
Jacobian $J_{C_a}(\mathbb{F}_{2^n})$ by making use of the Frobenius automorphism. With
two precomputations, we are able to obtain a speed-up by a factor of 5.5
compared to the generic double-and-add method in the Jacobian. If we
allow 6 precomputations, we are even able to speed up by a factor of 7.



1 Introduction 

Public-key cryptosystems based on the discrete logarithm problem on elliptic
curves over finite fields were invented by Neal Koblitz [9] and Victor Miller
[17]. Elliptic curve cryptosystems became a popular choice for implementations.
The most important operation in an elliptic curve based cryptosystem is the
computation of m-folds with a positive integer $m$. That means computing $mP$
for a point $P$ on an elliptic curve. For example, the complexity of the ElGamal
encryption scheme [3] and the Diffie-Hellman key agreement protocol [2] on an
elliptic curve both depend mostly on the complexity of computing m-folds.

The standard method for computing m-folds in a group $G$ is the
double-and-add method. If $P$ is an element of $G$ and $m$ a positive integer, doubles and
additions are performed with respect to the binary representation of $m$, requiring
about $\log_2(m)$ doubles and $\log_2(m)/2$ additions on average. Assuming that
doubles and additions have about the same complexity, this method requires
$3 \log_2(m)/2$ group operations. Allowing precomputations and using memory,
various techniques apply to speed up the double-and-add method (see [8]).
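The operation count can be observed directly. The following Python sketch implements the left-to-right double-and-add method for an arbitrary group given by its `add` and `double` functions and counts both kinds of operations. As a stand-in for a curve group it uses the additive group of integers modulo a small prime, which is purely an assumption made to keep the example self-contained.

```python
def double_and_add(m, P, add, double):
    """Compute mP for a positive integer m, scanning the bits of m from the top."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    bits = bin(m)[2:]
    Q = P                      # the leading bit of m is always 1
    doubles = adds = 0
    for bit in bits[1:]:
        Q = double(Q)
        doubles += 1
        if bit == "1":
            Q = add(Q, P)
            adds += 1
    return Q, doubles, adds


# Toy group: integers modulo N under addition (a placeholder for a curve group).
N = 1009
add = lambda a, b: (a + b) % N
double = lambda a: (2 * a) % N

result, doubles, adds = double_and_add(0b1011, 5, add, double)
print(result, doubles, adds)
```

The number of doubles is the bit length of $m$ minus one, and the number of additions is the number of one bits below the leading bit, which averages to about half the bit length for random $m$.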

In [11,12,23,15,24], a family of elliptic curves was investigated which allows
the scalar multiplication to be sped up considerably with the help of the Frobenius
automorphism. They considered the elliptic curves $E_a : v^2 + uv = u^3 + a u^2 + 1$
defined over $\mathbb{F}_2$ with base field $\mathbb{F}_{2^n}$, which are called elliptic Koblitz curves.

The fastest known attack on the elliptic curve discrete logarithm problem is
the parallelized Pollard's rho method [20,22,27]. As noticed in [5,28], the attack
time for these curves can be reduced by a factor of $\sqrt{2n}$, which causes one to
select slightly larger secure key parameters.

Hyperelliptic curve cryptosystems were introduced by Neal Koblitz [10]
in 1989. Cantor's algorithm [1] provides an effective method for performing
the group law in the Jacobian of a hyperelliptic curve.

In this paper, we generalize the ideas for elliptic Koblitz curves to hyperelliptic
curves of genus 2. We concentrate on the following two hyperelliptic curves

$$C_a : v^2 + uv = u^5 + a u^2 + 1 \quad (a = 0, 1),$$

which are defined over $\mathbb{F}_2$ and have the base field $\mathbb{F}_{2^n}$, where $n$ is prime. These
curves are generalized Koblitz curves of genus 2 and are twists of each other.
Furthermore, they are the only non-supersingular curves mentioned in [10] and
thus resist the Frey-Rück attack [4].

We want to point out that the curves $C_a$ have two major advantages. Firstly,
the cardinality of the Jacobian of $C_a$, $\#J_{C_a}(\mathbb{F}_{2^n})$, can be easily determined for
any $n$, whereas in general computing $\#J_C(\mathbb{F}_{2^n})$ for a random hyperelliptic curve $C$
over $\mathbb{F}_{2^n}$ of genus 2 appears to be difficult. Secondly, we can use the Frobenius
automorphism to eliminate many cryptosystem operations. At the cost of two
precomputations, we are able to speed up the computation of m-folds by a factor
of 5.5. With 6 precomputations, we even obtain a speed-up by a factor of 7. On
the other hand, a generalization of the methods in [5,28] shows that one can speed
up the attack on hyperelliptic cryptosystems by a factor of $\sqrt{n}$ if the curve has
an automorphism of order $n$ (see [7]). Since the curves $C_a$ have at least an
automorphism of order $n$, namely the Frobenius automorphism, the attack on
cryptosystems based on the discrete logarithm in $J_{C_a}(\mathbb{F}_{2^n})$ can be sped up by a
factor of $\sqrt{n}$. As in the case of an elliptic curve, one then has to adjust the size of
the key space marginally. Most of the results can be easily generalized to all other
genus 2 hyperelliptic curves and, more generally, to curves of arbitrary genus [13].
However, for arbitrary curves, one has to be careful with the parameter choice
for $g$ and $n$. Results in [6,7] seem to suggest that hyperelliptic curves of genus
$g > 4$ are not as secure as elliptic curves.

2 Hyperelliptic Curves 

2.1 Basic Definitions 

For details on hyperelliptic curves we refer to [10,16,1,26]. Let $\mathbb{F}_q$ be a finite
field and $\overline{\mathbb{F}}_q$ its algebraic closure. A non-singular hyperelliptic curve of genus $g$
is defined by the equation

$C : v^2 + h(u)v = f(u)$ in $\mathbb{F}_q[u,v]$ ,    (2.1)

where $h(u), f(u) \in \mathbb{F}_q[u]$, $\deg_u(h) \le g$, $f(u)$ monic, $\deg_u(f) = 2g + 1$, and if
$y^2 + h(x)y = f(x)$ for $(x, y) \in \overline{\mathbb{F}}_q \times \overline{\mathbb{F}}_q$, then $2y + h(x) \ne 0$ or $h'(x)y - f'(x) \ne 0$.

Let $\mathbb{F}_{q^n}$ be a subfield of $\overline{\mathbb{F}}_q$ containing $\mathbb{F}_q$. The set of $\mathbb{F}_{q^n}$-points $P$ on $C$ is
given by $C(\mathbb{F}_{q^n}) = \{(x,y) \in \mathbb{F}_{q^n} \times \mathbb{F}_{q^n} \mid y^2 + h(x)y = f(x)\} \cup \{\infty\}$, where $\infty$ denotes
the point at infinity. For an $\mathbb{F}_{q^n}$-point $P = (x,y)$, the opposite $\tilde{P}$ of $P$ is
given by $\tilde{P} = (x, -y - h(x))$. For $P = \infty$ define $\tilde{P} = \infty$.

A divisor on $C$ is a finite formal sum $D = \sum_{P \in C} m_P P$, where the $m_P$ are
integers that are 0 for almost all $P$. The degree of $D$ is defined by
$\deg D = \sum_{P \in C} m_P$. $D$ is said to be defined over $\mathbb{F}_{q^n}$ if
$D^\sigma = \sum_{P \in C} m_P P^\sigma = D$ for any $\sigma \in \mathrm{Aut}(\overline{\mathbb{F}}_q/\mathbb{F}_{q^n})$.$^1$
The set $\mathbb{D}_C(\mathbb{F}_{q^n})$ of divisors of $C$ defined over $\mathbb{F}_{q^n}$ forms an
additive group which contains the subgroup $\mathbb{D}^0_C(\mathbb{F}_{q^n})$ of all degree zero
divisors of $\mathbb{D}_C$ defined over $\mathbb{F}_{q^n}$. The greatest common divisor of
$D_1 = \sum_{P \in C} m_P P$ and $D_2 = \sum_{P \in C} n_P P$ is defined by

$$\gcd(D_1, D_2) = \sum_{P \in C} \min(m_P, n_P) P - \Big( \sum_{P \in C} \min(m_P, n_P) \Big) \infty .$$

Furthermore, the divisor of a polynomial $G(u,v) \in \overline{\mathbb{F}}_q[u,v]$ is defined by
$\mathrm{div}(G(u,v)) = \sum_P \mathrm{ord}_P(G) P - \big(\sum_P \mathrm{ord}_P(G)\big)\infty$, where $\mathrm{ord}_P(G)$ is the order of
vanishing of $G(u,v)$ at $P$. Now, the divisor of a rational function $G(u,v)/H(u,v)$
is called a principal divisor and is defined by
$\mathrm{div}(G(u,v)/H(u,v)) = \mathrm{div}(G(u,v)) - \mathrm{div}(H(u,v))$. We denote by $\mathbb{P}_C(\mathbb{F}_{q^n})$ the group
of principal divisors. Since every principal divisor has degree 0, $\mathbb{P}_C(\mathbb{F}_{q^n})$ is a
subgroup of $\mathbb{D}^0_C(\mathbb{F}_{q^n})$. Finally, the Jacobian of $C$ over $\mathbb{F}_{q^n}$ is given by

$$J_C(\mathbb{F}_{q^n}) = \mathbb{D}^0_C(\mathbb{F}_{q^n}) / \mathbb{P}_C(\mathbb{F}_{q^n}) .$$

$^1$ $P^\sigma$ denotes $(\sigma(x), \sigma(y))$ if $P = (x,y)$, and $\infty$ if $P = \infty$.

2.2 Reduced Divisors

Let $C_f$ be the set $C - \{\infty\}$ of finite points on $C$. A degree zero divisor
$D = \sum_{P \in C_f} m_P P - \big(\sum_{P \in C_f} m_P\big)\infty$ is called semi-reduced if it satisfies the following
conditions for each $P \in C_f$:

(i) $m_P \ge 0$.

(ii) If $P \ne \tilde{P}$ and $m_P > 0$, then $m_{\tilde{P}} = 0$.

(iii) If $P = \tilde{P}$ and $m_P > 0$, then $m_P = 1$.

A semi-reduced divisor is called reduced if, in addition,

$$\sum_{P \in C_f} m_P \le g .$$

Reduced divisors have the crucial property (see [19]) that for each degree zero
divisor $D$ there exists a unique reduced divisor $D_r$ such that $D - D_r$ is a principal
divisor. That means, the set of reduced divisors of $C$ forms a complete system of
representatives for the divisor classes of $C$. Semi-reduced divisors can be
represented uniquely in an advantageous way (see [16,19]): Let
$D = \sum_{P \in C_f} m_P P - \big(\sum_{P \in C_f} m_P\big)\infty$ be a semi-reduced divisor which is defined over $\mathbb{F}_{q^n}$.
We put $a(u) = \prod_{P \in C_f} (u - x_P)^{m_P} \in \mathbb{F}_{q^n}[u]$, where $P = (x_P, y_P)$. Then there exists a
unique polynomial $b(u) \in \mathbb{F}_{q^n}[u]$ such that $D = \gcd(\mathrm{div}(a(u)), \mathrm{div}(b(u) - v))$
and

1. $\deg_u b < \deg_u a$,

2. $b(x_P) = y_P$ for each $P \in C_f$ with $m_P \ne 0$,

3. $a(u)$ divides $b(u)^2 + b(u)h(u) - f(u)$.

In this case, we write $D = [a(u), b(u)]$. It follows that every element $D$ of
$J_C(\mathbb{F}_{q^n})$ can be uniquely represented by two polynomials $a(u), b(u) \in \mathbb{F}_{q^n}[u]$, where $a(u)$
is monic, $\deg b(u) < \deg a(u) \le g$, and $a(u)$ divides $b(u)^2 + b(u)h(u) - f(u)$. We
notice that operations in the Jacobian can be performed by using the arithmetic
in $\mathbb{F}_{q^n}[u]$. Without explaining the algorithms here, we mention that there exists
an effective method to add two elements of the Jacobian which is known as
Cantor's algorithm. For details we refer to [1,10,16]. The generic operation needs
$17g^2 + O(g)$ operations in $\mathbb{F}_{q^n}$ whereas doubling needs $16g^2 + O(g)$ operations in
$\mathbb{F}_{q^n}$ (see [25])$^2$. So, we can assume that both operations have roughly the same
complexity. It is important to note that inversion is basically for free, since the
negative of $D = [a(u), b(u)]$ is given by $-D = [a(u), -h(u) - b(u)]$.

2.3 Frobenius Automorphism 

The Frobenius automorphism $\phi : \overline{\mathbb{F}}_q \to \overline{\mathbb{F}}_q$, $x \mapsto x^q$ extends to an
automorphism on the Jacobian of $C$. Namely, we put $P^\phi = (x^q, y^q)$ for
$P = (x,y) \in \overline{\mathbb{F}}_q \times \overline{\mathbb{F}}_q$, and $\infty^\phi = \infty$. For a divisor $D = \sum_{P \in C} m_P P \in \mathbb{D}_C$ define $D^\phi$ to
be $\sum_{P \in C} m_P P^\phi$. A very important property of the Frobenius action on
semi-reduced divisors is provided by the following

Theorem 2.1. Let $C : v^2 + h(u)v = f(u)$ be a hyperelliptic curve of genus $g$,
where $h(u), f(u) \in \mathbb{F}_q[u]$. If $D$ is a semi-reduced divisor of $C$ which is defined
over $\mathbb{F}_{q^n}$, then the divisor $D^\phi$ is semi-reduced and defined over $\mathbb{F}_{q^n}$. In particular,
if $D = [a(u), b(u)]$ with $a(u), b(u) \in \mathbb{F}_{q^n}[u]$, then we have

$D^\phi = [a(u)^\phi, b(u)^\phi]$ .

Proof. Let $D = \sum_{P \in C_f} m_P P - \big(\sum_{P \in C_f} m_P\big)\infty$ be a semi-reduced divisor which
is defined over $\mathbb{F}_{q^n}$. Since $\phi$ is an automorphism, each property that $D^\phi$ has to
satisfy for being semi-reduced follows from the condition that $D$ is semi-reduced.
$\phi$ commutes with any $\sigma \in \mathrm{Aut}(\overline{\mathbb{F}}_q/\mathbb{F}_{q^n})$. Therefore, we have
$(D^\phi)^\sigma = (D^\sigma)^\phi = D^\phi$ for any $\sigma \in \mathrm{Aut}(\overline{\mathbb{F}}_q/\mathbb{F}_{q^n})$. That means $D^\phi$ is defined
over $\mathbb{F}_{q^n}$. Now, let $a(u) = \prod_{P \in C_f} (u - x_P)^{m_P} \in \mathbb{F}_{q^n}[u]$ and let $b(u)$ be the unique
polynomial satisfying conditions 1-3 in Section 2.2 such that

$D = \gcd(\mathrm{div}(a(u)), \mathrm{div}(b(u) - v)) = [a(u), b(u)]$ .

Then

$D^\phi = \sum_{P \in C_f} m_P P^\phi - \big(\sum_{P \in C_f} m_P\big)\infty = \gcd(\mathrm{div}(\tilde{a}(u)), \mathrm{div}(\tilde{b}(u) - v)) = [\tilde{a}(u), \tilde{b}(u)]$ ,

where $\tilde{a}(u) = \prod_{P \in C_f} (u - x_P^q)^{m_P} \in \mathbb{F}_{q^n}[u]$, and $\tilde{b}(u) \in \mathbb{F}_{q^n}[u]$ is the unique
polynomial satisfying conditions 1-3 in Section 2.2 for $D^\phi$. Clearly,
$a^\phi(u) = \prod_{P \in C_f} (u - x_P^q)^{m_P} = \tilde{a}(u)$. It remains to show that $b^\phi(u) = \tilde{b}(u)$. Firstly, we have
$\deg_u b^\phi < \deg_u a^\phi = \deg_u \tilde{a}$, since $\deg_u b < \deg_u a$. Secondly,
$b^\phi(x_P^q) = (b(x_P))^q = y_P^q$ for each $P \in C_f$ with $m_P \ne 0$. Thirdly, $a^\phi(u)$ divides
$(b(u)^2 + b(u)h(u) - f(u))^\phi = b^\phi(u)^2 + b^\phi(u)h(u) - f(u)$, since $a(u)$ divides
$b(u)^2 + b(u)h(u) - f(u)$ and $h(u), f(u) \in \mathbb{F}_q[u]$. Since $\tilde{b}(u)$ is unique with these
three properties for $D^\phi$ and $\tilde{a}(u) = a^\phi(u)$, we must have $b^\phi(u) = \tilde{b}(u)$.

$^2$ We remark that there exist even faster methods if the characteristic of $\mathbb{F}_{q^n}$ is 2 and
if we use normal basis representation for elements in $\mathbb{F}_{q^n}$.

An important consequence of this theorem is that if $D = [a(u), b(u)]$ is
a reduced divisor representing an element of $J_C(\mathbb{F}_{q^n})$, then the action of the
Frobenius on $D$ is given by $D^\phi = [a(u)^\phi, b(u)^\phi]$. Notice that this is only true
since $C$ is defined over the subfield $\mathbb{F}_q$ of $\mathbb{F}_{q^n}$. The interpretation is that if
$a(u) = \sum_{i=0}^{g} a_i u^i$ and $b(u) = \sum_{i=0}^{g-1} b_i u^i \in \mathbb{F}_{q^n}[u]$ are the explicit
representations of $a(u)$ and $b(u)$, then $a^\phi(u) = \sum_{i=0}^{g} a_i^q u^i$ and $b^\phi(u) = \sum_{i=0}^{g-1} b_i^q u^i$.

The practical meaning of this observation is that if we use normal basis
representation for elements in $\mathbb{F}_{q^n}$, then $a^\phi(u)$ and $b^\phi(u)$ can be determined by simply
shifting the normal basis representation of each coefficient $a_i$ and $b_i$ in order to
compute $D^\phi$. The complexity is therefore at most $2g$ cyclic shifts. These shift
operations are basically "for free" when compared to the more expensive group
operation in the Jacobian.

3 Algorithms for $v^2 + uv = u^5 + au^2 + 1$

For the remainder of the paper, we consider the curves $C_a : v^2 + uv = u^5 +
au^2 + 1$ with $a = 0, 1$ which are defined over $\mathbb{F}_2$. From [10], we know that the
characteristic equation of the Frobenius of the curve $C_a$ is given by

$\tau^4 + (-1)^a \tau^3 + (-1)^a 2\tau + 4 = 0$ .    (3.2)

It follows that

$-4D = \phi^4(D) + (-1)^a \phi^3(D) + (-1)^a 2\phi(D)$    (3.3)

for any $D \in J_{C_a}(\overline{\mathbb{F}}_2)$. Here, $\phi(D) := D^\phi$. The equation (3.2) has four solutions

$\tau_{1/2} = (-1)^{a+1} (\mu_1 \pm i\sqrt{4 - \mu_1})/2$ ,  $\tau_{3/4} = (-1)^{a+1} (\mu_2 \pm i\sqrt{4 - \mu_2})/2$ ,

where $\mu_{1/2} = (1 \pm \sqrt{17})/2$. We put $\tau = \tau_1$ and can regard $\tau$ as the element $\phi$
in the automorphism ring of $J_{C_1}(\overline{\mathbb{F}}_2)$. As the roots for both curves are equal up
to signs, it suffices to consider $C_1$. Analogous results hold true for the curve $C_0$
with some slight modifications. In particular, $\#J_{C_0}(\mathbb{F}_{2^n})$ differs from $\#J_{C_1}(\mathbb{F}_{2^n})$
only for odd $n$.
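
As a numerical sanity check of the root formulas (a sketch under the reading of (3.2) given above; the function names are hypothetical), one can evaluate the characteristic polynomial at the four claimed roots and confirm that each root has absolute value $\sqrt{2}$:

```python
import math


def frobenius_roots(a):
    """Roots tau_1..tau_4 of (3.2) for the curve C_a, a in {0, 1}."""
    sign = (-1) ** (a + 1)
    roots = []
    for mu in ((1 + math.sqrt(17)) / 2, (1 - math.sqrt(17)) / 2):
        imag = math.sqrt(4 - mu)
        roots.append(sign * complex(mu, imag) / 2)
        roots.append(sign * complex(mu, -imag) / 2)
    return roots


def char_poly(t, a):
    """Left-hand side of (3.2): t^4 + (-1)^a t^3 + (-1)^a 2t + 4."""
    s = (-1) ** a
    return t ** 4 + s * t ** 3 + 2 * s * t + 4
```

Floating-point arithmetic is used here only for illustration; the exact integer computations in the following sections do not depend on it.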
3.1 Computing $\tau$-Adic Expansions

We are interested in expansions like $11 = -\tau^7 + \tau^4 - 2\tau^2 + 3$, which enable us
to compute $11D$ by $11D = -\phi^7(D) + \phi^4(D) - 2\phi^2(D) + 3D$. More generally,
for a given integer $m$, we are interested in its $\tau$-adic expansion $m = \sum_{i=0}^{l-1} c_i \tau^i$
with coefficients $c_i \in R$, where $R$ is a suitable subset of the integers. First, we
consider $R = \{0, \pm 1, \pm 2, \pm 3\}$. In Sect. 5, we will vary the set $R$.

Let $z = a + b\tau + c\tau^2 + d\tau^3$ be any element of $\mathbb{Z}[\tau]$. Since $\tau$ is a root of (3.2),
$z$ is divisible by $\tau$ if and only if $4 \mid a$, as can be shown by direct computation.
Therefore, we must have $\tau \mid z - u$ for some $u \in \{0, 1, 2, 3\}$. It follows that

$z - u = \tau \big( ((a-u)/2 + b) + c\tau + ((a-u)/4 + d)\tau^2 - ((a-u)/4)\tau^3 \big)$ .    (3.4)

With $R = \{0, \pm 1, \pm 2, \pm 3\}$ we are able to realize the strategy "at least one of four
consecutive coefficients is zero" when determining the $c_i$'s. The basic algorithm
for computing $\tau$-adic expansions of $z = a + b\tau + c\tau^2 + d\tau^3 \in \mathbb{Z}[\tau]$ is to choose a
$u \in R$ such that $4 \mid a - u$, to divide $z - u$ by $\tau$ and then to repeat these two steps
with the new, replaced $z' = ((a-u)/2 + b) + c\tau + ((a-u)/4 + d)\tau^2 - ((a-u)/4)\tau^3$,
see (3.4), until the resulting $z'$ is zero. Then the sequence of those $u$'s will
be the sequence of the coefficients $c_0, \dots, c_{l-1} \in R$ we have searched for. We
proceed as follows:

1. If $4 \mid a$, then $\tau \mid z$ and we clearly use $u = 0$.

2. If $4 \nmid a$, then since $R = \{0, \pm 1, \pm 2, \pm 3\}$ we have exactly two choices for $u$
and we can try to make one of the subsequent $a$'s divisible by 4:

(a) If $2 \mid b$, then there is exactly one $u \in R$ such that $4 \mid a - u$ and
$4 \mid ((a-u)/2 + b)$, namely

    b mod 4 \ a mod 8 |   1   2   3   5   6   7
    ------------------+------------------------
            0         |   1   2   3  -3  -2  -1
            2         |  -3  -2  -1   1   2   3

Using these values for $u$, the actual $u$ is nonzero but the next one will
be zero.

(b) If $2 \nmid b$, then we are only able to make the third successor of the actual
$a$ at the latest be divisible by 4 by using

    d mod 2 \ a mod 8 |   1   2   3   5   6   7
    ------------------+------------------------
            0         |   1   2   3  -3  -2  -1
            1         |  -3  -2  -1   1   2   3


This strategy produces expansions $m = \sum_{i=0}^{l-1} c_i \tau^i$ with coefficients $c_i$ in
$R = \{0, \pm 1, \pm 2, \pm 3\}$, where $c_i c_{i+1} c_{i+2} c_{i+3} = 0$ $(i \in \{0, \dots, l-4\})$.

For an integer $m$, the expected length $l$ of such an expansion is $2\log_2 |m|$.
Note that this is about twice as long as the binary expansion $m = \sum b_i 2^i$, where
$b_i \in \{0, 1\}$. We will show in the following section how to reduce the length of
the $\tau$-adic representation.
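
The digit selection and the division by $\tau$ from (3.4) can be sketched as follows for the curve $C_1$. This is a minimal illustration under the tables as reconstructed above, not the authors' code: elements $a + b\tau + c\tau^2 + d\tau^3$ are stored as 4-tuples, and the names `tau_adic`, `choose_digit`, `mul_tau` and `evaluate` are hypothetical. A length cap guards against non-termination in case a table entry was misread.

```python
ROW_PLUS = {1: 1, 2: 2, 3: 3, 5: -3, 6: -2, 7: -1}
ROW_MINUS = {1: -3, 2: -2, 3: -1, 5: 1, 6: 2, 7: 3}


def mul_tau(z):
    """Multiply a + b*tau + c*tau^2 + d*tau^3 by tau, using tau^4 = -4 + 2*tau + tau^3."""
    a, b, c, d = z
    return (-4 * d, a + 2 * d, b, c + d)


def choose_digit(z):
    """Pick u in R = {0, +-1, +-2, +-3} following cases 1, 2(a), 2(b)."""
    a, b, _, d = z
    if a % 4 == 0:
        return 0
    if b % 2 == 0:
        row = ROW_PLUS if b % 4 == 0 else ROW_MINUS
    else:
        row = ROW_PLUS if d % 2 == 0 else ROW_MINUS
    return row[a % 8]


def tau_adic(m, max_len=10_000):
    """Return the digits c_0, c_1, ... with m = sum c_i * tau^i."""
    z = (m, 0, 0, 0)
    digits = []
    while z != (0, 0, 0, 0):
        if len(digits) >= max_len:
            raise RuntimeError("expansion did not terminate")
        a, b, c, d = z
        u = choose_digit(z)
        digits.append(u)
        q = (a - u) // 4              # exact, since 4 | a - u
        z = (2 * q + b, c, q + d, -q)  # (z - u) / tau, see (3.4)
    return digits


def evaluate(digits):
    """Horner evaluation of sum c_i * tau^i back into a 4-tuple."""
    acc = (0, 0, 0, 0)
    for u in reversed(digits):
        a, b, c, d = mul_tau(acc)
        acc = (a + u, b, c, d)
    return acc
```

For $m = 11$ this reproduces the expansion $11 = 3 - 2\tau^2 + \tau^4 - \tau^7$ quoted at the beginning of this section.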
3.2 Reducing the Length of the Representation

Since the order of the Frobenius is $n$, two automorphisms $\sum_i c_i \phi^i$ and
$\sum_i c_i' \phi^i$ are the same if $\sum_i c_i \tau^i \equiv \sum_i c_i' \tau^i \bmod \tau^n - 1$. Thus the
corresponding $\tau$-adic expansions are equivalent. Therefore before expanding $m$, we
reduce it modulo $\tau^n - 1$ in $\mathbb{Z}[\tau]$ to obtain a shorter representation
$[m] = \sum_i c_i \phi^i$ of the multiplication-by-$m$ map. We look for an element
$M \in \mathbb{Z}[\tau]$ such that $M \equiv m \bmod \tau^n - 1$
and the $\tau$-adic expansion of $M$ is as short as possible, i.e. $|M|$ is as small as
possible.



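Theorem 3.1. For any positive integers $m$ and $n$, there exists an element
$M \in \mathbb{Z}[\tau]$ such that

1. $m \equiv M \bmod \tau^n - 1$,

2. $2\log_2 |M| < n + 5$.

Proof. Let $r = m/(\tau^n - 1) \in \mathbb{Q}(\tau)$. Then there exist $r_0, r_1, r_2, r_3$ in $\mathbb{Q}$ such
that $r = \sum_{i=0}^{3} r_i \tau^i$. Let $v_i$ be the nearest integer to $r_i$ for $i = 0, \dots, 3$. We put
$v = \sum_{i=0}^{3} v_i \tau^i$ and $M = m - v(\tau^n - 1)$. Then $m \equiv M \bmod \tau^n - 1$. By using
the identity

$$|M|^2 = |m - v(\tau^n - 1)|^2 = |(r - v)(\tau^n - 1)|^2$$

we derive that $|M/(\tau^n - 1)|^2 < 14$. Since $|\tau^n - 1|^2 < (2^{n/2} + 1)^2$, we derive that
$2\log_2 |M| < n + 5$.
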


For any positive integers $m$ and $n$, we are easily able to determine an element
$M \in \mathbb{Z}[\tau]$ satisfying $m \equiv M \bmod \tau^n - 1$ and having a $\tau$-adic expansion of
length $l \approx n$. We call this representation the reduced $\tau$-adic expansion of $m$. In
the automorphism ring of the Jacobian, we obtain for the multiplication-by-$m$
map that $[m] = \sum_{i=0}^{l-1} c_i \phi^i$. The algorithm to compute $M$ from $m$ is along the
lines of the proof of Theorem 3.1. We therefore omit it. We remark here that we
need to be able to find a representation of $\tau^n - 1$ as $\tau^n - 1 = a + b\tau + c\tau^2 + d\tau^3$
with integers $a, b, c, d$. The next section will solve this problem. Furthermore, we
need to be able to compute multiplicative inverses in $\mathbb{Q}[\tau]$. This can be done by
the usual extended gcd for polynomials.



3.3 Representing $\tau^n - 1$ by $a + b\tau + c\tau^2 + d\tau^3$

Computing $a, b, c, d \in \mathbb{Z}$ such that $\tau^n - 1 = a + b\tau + c\tau^2 + d\tau^3$ is not a difficult task.
Let $n$ be a positive integer. Suppose that $\tau^{n-1} = a_{n-1} + b_{n-1}\tau + c_{n-1}\tau^2 + d_{n-1}\tau^3$
for unique integers $a_{n-1}, b_{n-1}, c_{n-1}, d_{n-1}$, then

$\tau^n = a_{n-1}\tau + b_{n-1}\tau^2 + c_{n-1}\tau^3 + d_{n-1}\tau^4$

$= -4d_{n-1} + (a_{n-1} + 2d_{n-1})\tau + b_{n-1}\tau^2 + (c_{n-1} + d_{n-1})\tau^3$ ,

since $\tau^4 = -4 + 2\tau + \tau^3$, and hence

$\tau^n - 1 = -(4d_{n-1} + 1) + (a_{n-1} + 2d_{n-1})\tau + b_{n-1}\tau^2 + (c_{n-1} + d_{n-1})\tau^3$.

Starting with $\tau^0 = 1$, we can compute the integers $a, b, c, d$ iteratively.
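
The iteration can be written down directly. The following sketch (function names are assumptions for illustration) applies the recurrence derived from $\tau^4 = -4 + 2\tau + \tau^3$ for the curve $C_1$:

```python
def tau_power(n):
    """Return (a, b, c, d) with tau^n = a + b*tau + c*tau^2 + d*tau^3 (curve C_1)."""
    a, b, c, d = 1, 0, 0, 0  # tau^0 = 1
    for _ in range(n):
        a, b, c, d = -4 * d, a + 2 * d, b, c + d
    return a, b, c, d


def tau_power_minus_one(n):
    """Return the coefficients of tau^n - 1."""
    a, b, c, d = tau_power(n)
    return a - 1, b, c, d
```

Only $n$ steps of integer arithmetic are needed, so this precomputation is negligible compared with the group operations.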
3.4 Computing $m$-Folds Using $\tau$-Adic Expansions

We now present our main algorithm for computing $m$-folds in the Jacobian
of the genus 2 curve $C_1 : v^2 + uv = u^5 + u^2 + 1$ with base field $\mathbb{F}_{2^n}$. Let
$D = [a(u), b(u)] \in J_{C_1}(\mathbb{F}_{2^n})$, where $\deg_u b < \deg_u a \le 2$, and $a(u)$ is monic.
For instance, if $\deg_u a = 2$, then $a(u) = a_0 + a_1 u + u^2$ and $b(u) = b_0 + b_1 u$ with
coefficients $a_0, a_1, b_0, b_1 \in \mathbb{F}_{2^n}$. If $\deg_u a < 2$, then we need even fewer coefficients
for $a(u)$ and $b(u)$. We assume that the coefficients of $a(u)$ and $b(u)$ are represented
with respect to a normal basis of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$.

Algorithm 3.2. (Computing Scalar Multiples)

INPUT: $a(u), b(u) \in \mathbb{F}_{2^n}[u]$ such that $D = [a(u), b(u)] \in J_{C_1}(\mathbb{F}_{2^n})$; $c_0, \dots, c_{l-1} \in
\{0, \pm 1, \pm 2, \pm 3\}$ with

$$m \equiv \sum_{i=0}^{l-1} c_i \tau^i \pmod{\tau^n - 1} .$$

OUTPUT: $s(u), t(u) \in \mathbb{F}_{2^n}[u]$ such that $mD = [s(u), t(u)] \in J_{C_1}(\mathbb{F}_{2^n})$.

1. Precompute $2D$ and $3D$.

2. $H \leftarrow c_{l-1} D = [s(u), t(u)]$;

3. For $i$ from $l - 2$ downto 0 do:

(a) $H \leftarrow H^\phi = [s(u)^\phi, t(u)^\phi]$;

(b) If ($c_i \ne 0$) then $H \leftarrow H + c_i D$;

4. Output($s(u), t(u)$).

Note that the operation $H \leftarrow H^\phi$ is nothing else than cyclic shifting of
at most 4 coefficients of $s(u)$ and $t(u)$, if $s(u)$ and $t(u)$ are represented with
respect to a normal basis. In general, $s(u) = s_0 + s_1 u + u^2$, $t(u) = t_0 + t_1 u$ with
$s_0, s_1, t_0, t_1 \in \mathbb{F}_{2^n}$. Thus, $H^\phi = [s_0^2 + s_1^2 u + u^2, t_0^2 + t_1^2 u]$, and the computation of
$H^\phi$ needs four cyclic shifts of elements in $\mathbb{F}_{2^n}$.
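
The control flow of Algorithm 3.2 does not depend on how divisors are represented, so it can be sketched generically. In the following illustration the group operations are passed in as functions, and, since implementing Cantor's algorithm is out of scope here, the algorithm is exercised on a stand-in group: the additive group of $\mathbb{Z}[\tau]$ (4-tuples), with multiplication by $\tau$ playing the role of the Frobenius. That stand-in and all names are assumptions made for this sketch.

```python
def frobenius_multiply(D, digits, frob, add, neg):
    """Horner-style evaluation of sum c_i * phi^i (D) with digits c_i in {0, +-1, +-2, +-3}."""
    # Step 1: precompute 2D and 3D.
    multiples = {1: D}
    multiples[2] = add(D, D)
    multiples[3] = add(multiples[2], D)

    def signed(c):
        return multiples[c] if c > 0 else neg(multiples[-c])

    # Step 2: start with the leading digit (assumed nonzero).
    H = signed(digits[-1])
    # Step 3: one Frobenius per digit, one addition per nonzero digit.
    for c in reversed(digits[:-1]):
        H = frob(H)
        if c != 0:
            H = add(H, signed(c))
    return H


# Stand-in group: Z[tau] as 4-tuples, with tau^4 = -4 + 2*tau + tau^3.
def mul_tau(z):
    a, b, c, d = z
    return (-4 * d, a + 2 * d, b, c + d)


def vec_add(x, y):
    return tuple(p + q for p, q in zip(x, y))


def vec_neg(x):
    return tuple(-p for p in x)


digits_of_11 = [3, 0, -2, 0, 1, 0, 0, -1]  # 11 = 3 - 2*tau^2 + tau^4 - tau^7
```

In a real implementation `frob` would be the coefficient-wise cyclic shift described above, and `add` would be Cantor's algorithm.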

3.5 Computing the Cardinality of the Jacobian 

The main step in computing the cardinality of $J_{C_1}(\mathbb{F}_{2^n})$ is to determine the
characteristic polynomial of the Frobenius. Since $C_1$ is defined over $\mathbb{F}_2$, this can
be accomplished very easily. The Theorem of Weil then gives us immediately
that

$$\#J_{C_1}(\mathbb{F}_{2^n}) = \prod_{i=1}^{4} (1 - \tau_i^n) = \big((1 + 2^n) - (\tau_1^n + \tau_2^n)\big)\big((1 + 2^n) - (\tau_3^n + \tau_4^n)\big) .$$

It appears that an explicit formula for the cardinality of Jacobians over $\mathbb{F}_{2^n}$ can
only be developed for supersingular curves (see [10]). For our curve, we have
to proceed differently. A way of evaluating products of the form $\prod_{i=1}^{r} (1 - \alpha_i^n)$,
where $\alpha_i$, $i = 1, \dots, r$ are the roots of an arbitrary polynomial, was suggested
by Pierce [21] and Lehmer [14]. Our case is very special. We can exploit the
additional structure of $P(T)$ to derive recursions of lower order and using integer
arithmetic only. If we assume that $\tau_1^n + \tau_2^n = A_n + \mu_1 B_n$, then we get for $n \ge 2$
that $\tau_1^n + \tau_2^n = (4B_{n-1} - 2A_{n-2}) + \mu_1 (A_{n-1} + B_{n-1} - 2B_{n-2})$. Equating coefficients
yields the following recursions. Put $A_0 = 2$, $A_1 = 0$, $B_0 = 0$, and $B_1 = 1$. For
$n \ge 2$ we define $A_n = 4B_{n-1} - 2A_{n-2}$ and $B_n = A_{n-1} + B_{n-1} - 2B_{n-2}$. Then
we have

$$\#J_{C_1}(\mathbb{F}_{2^n}) = (1 + 2^n)^2 - (2A_n + B_n)(1 + 2^n) + (A_n^2 + A_n B_n - 4B_n^2) .$$

A similar approach leads to formulas for $\#J_{C_0}(\mathbb{F}_{2^n})$.

Table 1. Average length and density

      n    average length    average density
     61         62.38             0.5460
     67         68.36             0.5458
     71         72.38             0.5455
     73         74.35             0.5449
     79         80.33             0.5445
     83         84.35             0.5440
     89         90.32             0.5441
     97         98.34             0.5437
    101        102.36             0.5433
    103        104.31             0.5429
    107        108.33             0.5434
    109        110.34             0.5424
    113        114.35             0.5427
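
The recursion for $A_n$ and $B_n$ needs only integer arithmetic. A minimal sketch, assuming the recursion and closed formula exactly as stated above for $C_1$ (the function name is hypothetical):

```python
def jacobian_order_c1(n):
    """Cardinality of J_{C_1}(F_{2^n}) for n >= 1 via the A_n, B_n recursion."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    a_prev, a_cur = 2, 0  # A_0, A_1
    b_prev, b_cur = 0, 1  # B_0, B_1
    for _ in range(n - 1):
        a_prev, a_cur, b_prev, b_cur = (
            a_cur,
            4 * b_cur - 2 * a_prev,
            b_cur,
            a_cur + b_cur - 2 * b_prev,
        )
    q = 1 + 2 ** n
    return q * q - (2 * a_cur + b_cur) * q + (a_cur ** 2 + a_cur * b_cur - 4 * b_cur ** 2)
```

For $n = 1$ the result agrees with the value of the characteristic polynomial at 1, namely $1 - 1 - 2 + 4 = 2$, and for $n = 2$ with the product of its values at $1$ and $-1$, namely $2 \cdot 8 = 16$. The same function can be used to compare against the entries of Table 2.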

4 Experimental Results 

This section contains three tables. Table 1 describes the length and the density
of reduced $\tau$-adic expansions. For each prime $n \in \{61, \dots, 113\}$, we generated
10000 random integers $m$ in the range $0 < m < \#J_{C_1}(\mathbb{F}_{2^n})$ and computed the
reduced $\tau$-adic representation. If $d$ denotes the number of the nonzero coefficients
$c_i$, and $l$ the length of the representation, the quotient $d/l$ is its density.

The value $n + \frac{4}{3}$ seems to be a good approximation for the expected length $l$ of
a reduced $\tau$-adic expansion. The asymptotic density (obtained by combinatorial
means) is $\frac{489}{910} \approx 0.537$. The experiments provide evidence that the density is
approaching the expected bound, so that the number of nonzero coefficients $c_i$
is approximately $\frac{489}{910}(n + \frac{4}{3})$. Therefore, Algorithm 3.2 for computing multiples
$mD$ for $D \in J_{C_1}(\mathbb{F}_{2^n})$ needs about $\frac{489}{910} n$ additions of reduced divisors, while the
shift operations are essentially for free. The double-and-add method for $J_{C_1}(\mathbb{F}_{2^n})$
needs about $2n$ doubles and $n$ additions of reduced divisors, so that the $\tau$-adic
method leads to a speed-up by a factor of

$$\frac{3n}{\frac{489 n}{910}} \approx 5.5 .$$

In Tables 2 and 3, respectively, we list examples for factorizations of
$\#J_{C_1}(\mathbb{F}_{2^n})$ and $\#J_{C_0}(\mathbb{F}_{2^n})$. We only considered the cases where $n$ takes on prime
values in the range 61 to 113 and where the cardinalities of Jacobians contain a
large prime factor.

Table 2. Computing the cardinality of the Jacobian $J_{C_1}(\mathbb{F}_{2^n})$

  n    $\#J_{C_1}(\mathbb{F}_{2^n})$

  61   5316911976894487061973100640561324954
       = 2 · 2658455988447243530986550320280662477

  67   21778071481105140023832236795388122729642
       = 2 · 3217 · 3384841697405212935006564624710619013

  97   25108406941546737996390354885625124943376439570684227477754
       = 2 · 389 · 1747 · 18473392463868826910318794676754071940716909907019619

  103  102844034832575383397207943835010553634640254575820398436091978
       = 2 · 47381 · 1085287719049570327739050925845914539948927360923370110769

  109  421249166674228800251100330124945140261321879842750041189776992282
       = 2 · 2617 · 620764811 · 129651709107106280529021406475320711149271787278988543

  113  107839786668602557431646595347682461521285605430038087099528386736762
       = 2 · 53919893334301278715823297673841230760642802715019043549764193368381

Table 3. Computing the cardinality of the Jacobian $J_{C_0}(\mathbb{F}_{2^n})$

  n    $\#J_{C_0}(\mathbb{F}_{2^n})$

  67   21778071484774983299499715182968742769496
       = 2^3 · 2722258935596872912437464397871092846187

  89   383123885216451157219690382614340814499889612946264008
       = 2^3 · 179 · 1069 · 83091469 · 3012049244523553711515420284982459139979


5 Improvements 

Following the idea of Koblitz [12], we modified our set of possible coefficients
and used the set

$R' = \{0, \pm 1, \pm 2, \pm(1 + \tau), \pm(1 - \tau), \pm(1 - 2\tau), \pm 2 + \tau\}$

as the domain of coefficients. Accepting the cost of 6 precomputations and storing
these elements (instead of only 2 for the set $R$), this choice enables us to realize
a sparse $\tau$-adic expansion in the sense that no two consecutive coefficients are
nonzero (cf. [24]). Using $u$ as in the following table we force $a + b\tau + c\tau^2 + d\tau^3 - u$
to be divisible by $\tau^2$, i.e. the next coefficient will be zero. If $4 \mid a$ then $u = 0$, else
take



    b mod 4 \ a mod 8 |    1        2         3         5        6         7
    ------------------+--------------------------------------------------------
            0         |    1        2     -(1-2τ)     1-2τ      -2        -1
            1         |   1+τ      2+τ    -(1+τ)      1-τ      -2+τ    -(1-τ)
            2         |  1-2τ      -2       -1          1        2     -(1-2τ)
            3         |   1-τ     -2+τ    -(1-τ)      1+τ       2+τ    -(1+τ)


By using this modified version of the $\tau$-adic expansion, the average length
of the reduced $\tau$-adic representations was $\approx n + 2$ for an extension of degree $n$.
The expected density for this set of coefficients is $\frac{3}{7} \approx 0.42857$. In Table 4, we
present our experimental results. The generation of the integers $m$ was identical
to the one in Table 1. The difference lies in the choice of the set $R'$ which yields
new $\tau$-adic expansions.

Table 4. Average length and density

      n    average length    average density
     61         63.02             0.4284
     67         69.00             0.4275
     71         72.98             0.4288
     73         32.15             0.4287
     79         81.01             0.4287
     83         84.99             0.4286
     89         91.00             0.4288
     97         99.67             0.4177
    101        102.95             0.4287
    103        104.93             0.4289
    107        109.05             0.4288
    109        111.01             0.4287
    113        114.96             0.4285


Therefore, with this set $R'$ we obtain a speed-up by a factor of 7
with respect to the binary expansion, at the cost of more storage and
precomputations.
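
The sparse expansion with the coefficient set $R'$ can be sketched in the same style as before. This is an illustration under the digit table as reconstructed above (curve $C_1$, $\tau^4 = -4 + 2\tau + \tau^3$); digits $u_0 + u_1\tau$ are stored as pairs `(u0, u1)`, and all names are hypothetical. A length cap guards against non-termination in case a table entry was misread.

```python
# SPARSE_TABLE[b mod 4][a mod 8] = (u0, u1), meaning the digit u0 + u1*tau.
SPARSE_TABLE = [
    {1: (1, 0), 2: (2, 0), 3: (-1, 2), 5: (1, -2), 6: (-2, 0), 7: (-1, 0)},
    {1: (1, 1), 2: (2, 1), 3: (-1, -1), 5: (1, -1), 6: (-2, 1), 7: (-1, 1)},
    {1: (1, -2), 2: (-2, 0), 3: (-1, 0), 5: (1, 0), 6: (2, 0), 7: (-1, 2)},
    {1: (1, -1), 2: (-2, 1), 3: (-1, 1), 5: (1, 1), 6: (2, 1), 7: (-1, -1)},
]


def sparse_tau_adic(m, max_len=10_000):
    """Digits in R' with m = sum digit_i * tau^i and no two adjacent nonzero digits."""
    z = (m, 0, 0, 0)
    digits = []
    while z != (0, 0, 0, 0):
        if len(digits) >= max_len:
            raise RuntimeError("expansion did not terminate")
        a, b, c, d = z
        u0, u1 = (0, 0) if a % 4 == 0 else SPARSE_TABLE[b % 4][a % 8]
        digits.append((u0, u1))
        q = (a - u0) // 4                    # exact, since 4 | a - u0
        z = (2 * q + b - u1, c, q + d, -q)   # (z - u) / tau
    return digits


def evaluate_sparse(digits):
    """Horner evaluation back into a + b*tau + c*tau^2 + d*tau^3."""
    acc = (0, 0, 0, 0)
    for u0, u1 in reversed(digits):
        a, b, c, d = acc
        acc = (-4 * d + u0, a + 2 * d + u1, b, c + d)
    return acc
```

For $m = 11$ the sketch yields $11 = (-1 + 2\tau) + (-2 + \tau)\tau^2 - \tau^7$, with every nonzero digit followed by a zero digit.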
Abstract. In this paper, the complexity of a squaring operation using
polynomial basis (PB) in a class of finite fields $\mathbb{F}_{2^m}$ is evaluated. The
main results are as follows:

1. When the field is generated with an irreducible trinomial $f(x) =
x^m + x^k + 1$, $1 \le k \le m/2$, where both $m$ and $k$ are odd, a PB
squaring operation requires $(m-1)/2$ bit operations.

2. When the field is generated with an irreducible trinomial $f(x) =
x^m + x^k + 1$, $1 \le k \le m/2$, where $m + k$ is odd and $k \ne m/2$, a PB
squaring operation requires $(m+k-1)/2$ bit operations.

3. When the field is generated with an irreducible trinomial $f(x) =
x^m + x^{m/2} + 1$, a PB squaring operation requires $(m+2)/4$ bit operations.



1 Introduction 

Finite field arithmetic has recently received much attention, mainly because of its
use in elliptic curve cryptography. In implementing an elliptic curve
cryptosystem, a normal basis is usually utilized, because a squaring operation in a normal
basis is only a cyclic shift of the element's coefficients. A multiplication
operation can also be performed efficiently with an optimal normal basis (ONB) [5].
It has been shown that a bit-parallel multiplication in $\mathbb{F}_{2^m}$ can be done in about
$2m^2$ ground field operations if a type-I ONB is chosen [2]. However, a type-I ONB
exists only in a small class of fields $\mathbb{F}_{2^m}$ where $m$ is an even number. Moreover,
the elliptic curve discrete logarithm is more likely to be comparatively easy to
compute when $m$ is composite [4]. On the other hand, it has been shown that a
bit-parallel multiplier using a trinomial-based polynomial basis (TPB) has about
the same complexity as one using a type-I ONB [3], while irreducible trinomials
over $\mathbb{F}_2$ of degree $m$ exist far more often than type-I ONBs. A squaring operation
in a TPB, however, is not free.

In this short article, we derive the complexity of a bit-parallel squaring
operation using a TPB in $\mathbb{F}_{2^m}$. It is shown to be of order $O(m)$ ground field operations
(compared to the $2m^2$ ground field operations needed for a bit-parallel
multiplication operation). If we compute an inverse in $\mathbb{F}_{2^m}$ using the method based on
Fermat's theorem, then the complexity of the $m - 1$ bit-parallel squaring operations
required is not greater than that of half a bit-parallel multiplication operation.
The propagation delay of the hardware architecture of a bit-parallel squarer is
also addressed.

The main results include: When the field is generated with an irreducible
trinomial $f(x) = x^m + x^k + 1$, $1 \le k \le m/2$, then a PB squaring operation
requires at most

1. $(m-1)/2$ bit additions, if both $m$ and $k$ are odd;

2. $(m+k-1)/2$ bit operations, if $m + k$ is odd and $k \ne m/2$;

3. $(m+2)/4$ bit operations, if $k = m/2$.

The organization of this paper is as follows: A brief introduction to the PB
squaring operation is given in Section 2. In Section 3, we present a new complexity
upper bound for the PB squaring operation in a class of finite fields. Hardware
bit-parallel implementation is addressed in Section 4. Finally, a few concluding
remarks are given in Section 5.

2 Polynomial Basis Squaring Operation 

Let $f(x)$ be the irreducible polynomial over $\mathbb{F}_2$ generating the field $\mathbb{F}_{2^m}$. Let
$A(x) = \sum_{i=0}^{m-1} a_i x^i$ be the polynomial representation of an arbitrary element of
$\mathbb{F}_{2^m}$. The squaring operation of $A(x)$ is

$$C(x) = \sum_{i=0}^{m-1} c_i x^i = A^2(x) \bmod f(x)
       = a_0 + a_1 x^2 + a_2 x^4 + \dots + a_{m-1} x^{2m-2} \bmod f(x).$$

It can be seen that squaring in $\mathbb{F}_{2^m}$ is actually a case of polynomial modular
reduction. Then the following corollary is obvious from the results on complexity
of polynomial modular reduction [6].

Corollary 1. Let the field $\mathbb{F}_{2^m}$ be generated with the irreducible $r$-term
polynomial $f(x)$ of degree $m$. Then squaring a field element in parallel can be performed
with at most $(r - 1)(m - 1)$ addition operations in $\mathbb{F}_2$.

When $f(x)$ is chosen as an irreducible trinomial, however, the complexity
can be further reduced.

3 Complexity Upper Bound for PB Squaring

In this section, we assume that the field is generated with an irreducible trinomial
$f(x) = x^m + x^k + 1$, $1 \le k \le m/2$. Based on the parity of $m$ and $k$, the
derivation is divided into the following three cases:

1. Both $m$ and $1 \le k < m/2$ are odd;

2. $m$ is odd and $1 < k < m/2$ is even;

3. $m$ is even and $1 < k < m/2$ is odd.



120 Huapeng Wu 



3.1 Both $m$ and $1 \le k < m/2$ Are Odd

Let

$$A^2(x) = \sum_{i=0}^{m-1} a_i x^{2i} = \sum_{i=0}^{2m-2} a'_i x^i ,$$

where $a'_i = a_{i/2}$ if $i$ even, and 0 if $i$ odd. Define

$$\sum_{i=0}^{m+2l+1} a'_i x^i \bmod f(x) = \sum_{i=0}^{m-1} t_i^{(l)} x^i$$

for $l = -1, 0, 1, \dots, \frac{m-1}{2} - 1$. Then we have

$$\sum_{i=0}^{m-1} t_i^{(l)} x^i = \sum_{i=0}^{m-1} t_i^{(l-1)} x^i + a'_{m+2l+1} x^{m+2l+1} \bmod f(x). \qquad (1)$$

The coefficients $t_i^{(l)}$ have their initial values $t_i^{(-1)} = a'_i$, and we try to solve the
final values $t_i^{((m-1)/2 - 1)} = c_i$, $i = 0, 1, \dots, m-1$. Note that $t_i^{(-1)} = 0$ if $i$ is an odd
number.

When $l = 0$,

$$\sum_{i=0}^{m-1} t_i^{(0)} x^i = \sum_{i=0}^{m-1} t_i^{(-1)} x^i + a'_{m+1} x^{m+1} \bmod f(x)
  = \sum_{i=0}^{m-1} t_i^{(-1)} x^i + a'_{m+1} (x + x^{k+1}) \bmod f(x).$$

Then we have

$$t_i^{(0)} = \begin{cases}
a'_i + a'_{m+1}, & i = k + 1; \\
a'_i, & i \text{ even, and } i \ne k + 1; \\
a'_{m+1}, & i = 1; \\
0, & \text{otherwise.}
\end{cases}$$

Clearly, one bit addition is needed to compute $t_i^{(0)}$ from $t_i^{(-1)}$, $i = 0, 1, \dots, m-1$.

In the following we will repeatedly use (1) for $l = 1, 2, \dots, \frac{m-1}{2} - 1$. It will
be seen that there are a few newly generated terms at each step. For example,
when $l = 0$ we have two newly generated terms $a'_{m+1} x$ and $a'_{m+1} x^{k+1}$. Note that
$k + 1$ is an even number and one bit operation is needed to take care of this even
power term. In fact, one bit addition is always required if an even power term
is generated, while one bit operation is probably needed if an odd power term is
generated. This is because for some $l$, $t_i^{(l-1)}$ could be zero for some odd $i$.

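For $l > 0$ and $l \le \frac{m-k}{2} - 1$ (in order to keep $k + 2l + 1 < m$), we have

$$\begin{aligned}
\sum_{i=0}^{m-1} t_i^{(l)} x^i
 &= \sum_{i=0}^{m-1} t_i^{(l-1)} x^i + a'_{m+2l+1} x^{m+2l+1} \bmod f(x) \\
 &= \sum_{i=0}^{m-1} t_i^{(l-1)} x^i + a'_{m+2l+1} x^{2l+1} (1 + x^k) \bmod f(x) \\
 &= \sum_{i=0}^{m-1} t_i^{(l-1)} x^i + a'_{m+2l+1} x^{2l+1} + a'_{m+2l+1} x^{k+2l+1} \bmod f(x).
\end{aligned}$$

Obviously in this step (from $t_i^{(l-1)}$ to $t_i^{(l)}$), one odd power term and one
even power term are generated at the right side of the above equation.

When $l$ runs through from 0 to $\frac{m-k}{2} - 1$, the value of $2l + 1$ runs through the
odd numbers from 1 to $m - k - 1$, and the value of $k + 2l + 1$ runs through the
even numbers from $k + 1$ to $m - 1$.

Therefore, when $0 \le l \le \frac{m-k}{2} - 1$, we have

$$t_i^{(l)} = \begin{cases}
t_i^{(l-1)} + a'_{m+2l+1}, & i = 2l + k + 1; \\
a'_{m+2l+1}, & i = 2l + 1; \\
t_i^{(l-1)}, & i \text{ even and } i \ne 2l + k + 1; \text{ or } i = 1, 3, \dots, 2l - 1; \\
0, & \text{otherwise}
\end{cases}$$

$$= \begin{cases}
t_i^{(l-1)} + a'_{m-k+i}, & i = 2l + k + 1; \\
a'_{m+i}, & i = 2l + 1; \\
t_i^{(l-1)}, & i \text{ even and } i \ne 2l + k + 1; \text{ or } i = 1, 3, \dots, 2l - 1; \\
0, & \text{otherwise}
\end{cases}$$

$$= \begin{cases}
a'_i + a'_{m-k+i}, & i = k + 1, k + 3, \dots, k + 2l + 1; \\
a'_{m+i}, & i = 1, 3, \dots, 2l + 1; \\
a'_i, & i \text{ even and } i \ne k + 1, k + 3, \dots, k + 2l + 1; \\
0, & \text{otherwise.}
\end{cases}$$

Thus for $l = \frac{m-k}{2} - 1$, we can solve $t_i^{(l)}$ as follows

$$t_i^{((m-k)/2 - 1)} = \begin{cases}
a'_i + a'_{m-k+i}, & i = k + 1, k + 3, \dots, m - 1; \\
a'_{m+i}, & i = 1, 3, \dots, m - k - 1; \\
a'_i, & i = 0, 2, \dots, k - 1; \\
0, & i = m - k + 1, m - k + 3, \dots, m - 2.
\end{cases}$$
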

In the following, we consider two cases:

1. If $k = 1$.

When $k = 1$, we have $\frac{m-k}{2} - 1 = \frac{m-1}{2} - 1$. Therefore,

$$c_i = t_i^{((m-1)/2 - 1)} = \begin{cases}
a'_i + a'_{m-1+i}, & i = 2, 4, \dots, m - 1; \\
a'_{m+i}, & i = 1, 3, \dots, m - 2; \\
a'_i, & i = 0.
\end{cases} \qquad (2)$$

It can be seen from (2) that $\frac{m-1}{2}$ bit additions are required for obtaining
$c_i$, $i = 0, 1, \dots, m - 1$.

2. If $1 < k < \frac{m}{2}$.

When $\frac{m-k}{2} \le l \le \frac{m-1}{2} - 1$,

$$\begin{aligned}
\sum_{i=0}^{m-1} t_i^{(l)} x^i
 &= \sum_{i=0}^{m-1} t_i^{(l-1)} x^i + a'_{m+2l+1} x^{m+2l+1} \bmod f(x) \\
 &= \sum_{i=0}^{m-1} t_i^{(l-1)} x^i + a'_{m+2l+1} x^{2l+1} [1 + x^k] \bmod f(x) \\
 &= \sum_{i=0}^{m-1} t_i^{(l-1)} x^i + a'_{m+2l+1} x^{2l+1} + a'_{m+2l+1} x^{2l+k+1} \bmod f(x) \\
 &= \sum_{i=0}^{m-1} t_i^{(l-1)} x^i + a'_{m+2l+1} x^{2l+1} + a'_{m+2l+1} x^{2l+k+1-m} [1 + x^k] \bmod f(x) \\
 &= \sum_{i=0}^{m-1} t_i^{(l-1)} x^i + a'_{m+2l+1} x^{2l+1} + a'_{m+2l+1} x^{2l+k+1-m} + a'_{m+2l+1} x^{2l+2k+1-m} \bmod f(x).
\end{aligned}$$

Since $k < \frac{m}{2}$ and $l \le \frac{m-1}{2} - 1$, we have $2l + 2k + 1 - m \le m - 2$. It
can be seen that there are two newly generated odd power terms ($x^{2l+1}$ and
$x^{2l+k+1-m}$) and one even power term ($x^{2l+2k+1-m}$) in this step. When $l$ runs
through from $\frac{m-k}{2}$ to $\frac{m-1}{2} - 1$, the value of $2l + 1$ runs through the odd
numbers from $m - k + 1$ to $m - 2$, the value of $2l + k + 1 - m$ runs through
the odd numbers from 1 to $k - 2$, and the value of $2l + 2k + 1 - m$ runs
through the even numbers from $k + 1$ to $2k - 2$.

Therefore, $t_i^{(l)}$ can be given as follows

$$t_i^{(l)} = \begin{cases}
a'_{m+2l+1}, & i = 2l + 1; \\
t_i^{(l-1)} + a'_{m+2l+1}, & i = 2l + 2k + 1 - m, \; 2l + k + 1 - m; \\
t_i^{(l-1)}, & i \text{ even and } i \ne 2l + 2k + 1 - m; \\
 & \text{or } i \text{ odd and } i = 1, 3, \dots, 2l + k - 1 - m, \\
 & \quad 2l + k + 3 - m, 2l + k + 5 - m, \dots, 2l - 1; \\
0, & \text{otherwise}
\end{cases}$$

$$= \begin{cases}
a'_{m+i}, & i = 2l + 1; \\
t_i^{(l-1)} + a'_{2m-k+i}, & i = 2l + k + 1 - m; \\
t_i^{(l-1)} + a'_{2m-2k+i}, & i = 2l + 2k + 1 - m; \\
t_i^{(l-1)}, & i \text{ even and } i \ne 2l + 2k + 1 - m; \\
 & \text{or } i \text{ odd and } i = 1, 3, \dots, 2l + k - 1 - m, \\
 & \quad 2l + k + 3 - m, 2l + k + 5 - m, \dots, 2l - 1; \\
0, & \text{otherwise}
\end{cases}$$

$$= \begin{cases}
a'_{m+i}, & i = 2l + k + 3 - m, 2l + k + 5 - m, \dots, 2l + 1; \\
a'_{m+i} + a'_{2m-k+i}, & i = 1, 3, \dots, 2l + k + 1 - m; \\
a'_i + a'_{m-k+i} + a'_{2m-2k+i}, & i = k + 1, k + 3, \dots, 2l + 2k + 1 - m; \\
a'_i + a'_{m-k+i}, & i = 2l + 2k + 3 - m, 2l + 2k + 5 - m, \dots, m - 1; \\
a'_i, & i = 0, 2, \dots, k - 1; \\
0, & i = 2l + 3, 2l + 5, \dots, m - 2.
\end{cases}$$

When $l = \frac{m-1}{2} - 1$, it follows from the above equations

$$c_i = t_i^{((m-1)/2 - 1)} = \begin{cases}
a'_{m+i}, & i = k, k + 2, \dots, m - 2; \\
a'_{m+i} + a'_{2m-k+i}, & i = 1, 3, \dots, k - 2; \\
a'_i + a'_{m-k+i} + a'_{2m-2k+i}, & i = k + 1, k + 3, \dots, 2k - 2; \\
a'_i + a'_{m-k+i}, & i = 2k, 2k + 2, \dots, m - 1; \\
a'_i, & i = 0, 2, \dots, k - 1.
\end{cases}$$

Rewrite the above equation as the following

$$\begin{aligned}
c_i &= a'_i, & i &= 0, 2, \dots, k - 1; & (3a) \\
c_i &= (a'_{m+i} + a'_{2m-k+i}), & i &= 1, 3, \dots, k - 2; & (3b) \\
c_{k+i} &= a'_{m+k+i}, & i &= 0, 2, \dots, m - k - 2; & (3c) \\
c_{k+i} &= a'_{k+i} + (a'_{m+i} + a'_{2m-k+i}), & i &= 1, 3, \dots, k - 2; & (3d) \\
c_{2k+i} &= a'_{2k+i} + a'_{m+k+i}, & i &= 0, 2, \dots, m - 2k - 1. & (3e)
\end{aligned}$$

Then it can be seen from (3b) and (3d) that some partial sums can be reused
(indicated with the bracket). This will save $\frac{k-1}{2}$ bit operations. The total
number of bit operations required for the squaring operation can be counted
from (3a-3e) and it is $\frac{m-1}{2}$.
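
The closed formulas (3a)-(3e) can be checked against a straightforward reduction. The sketch below is illustrative only: field elements are bitmasks of the coefficients $a_0, \dots, a_{m-1}$, the function names are hypothetical, and the small parameters used in the usage note are chosen for exhaustive checking (the identity between the two functions holds whether or not the trinomial is irreducible).

```python
def square_naive(a, m, k):
    """Square a modulo x^m + x^k + 1 by spreading bits and reducing from the top."""
    sq = 0
    for i in range(m):
        if (a >> i) & 1:
            sq |= 1 << (2 * i)
    for j in range(2 * m - 2, m - 1, -1):
        if (sq >> j) & 1:
            # x^j = x^(j-m+k) + x^(j-m) modulo the trinomial
            sq ^= (1 << j) | (1 << (j - m + k)) | (1 << (j - m))
    return sq


def square_formula(a, m, k):
    """Square a using (3a)-(3e); requires m, k odd and 1 < k < m/2.

    Returns (result, number of XORs used).
    """
    assert m % 2 == 1 and k % 2 == 1 and 1 < k < m / 2
    ap = [0] * (2 * m - 1)            # a'_i: a_{i/2} for even i, 0 for odd i
    for i in range(m):
        ap[2 * i] = (a >> i) & 1
    c = [0] * m
    xors = 0
    for i in range(0, k, 2):          # (3a)
        c[i] = ap[i]
    for i in range(1, k - 1, 2):      # (3b) and (3d) share one partial sum
        shared = ap[m + i] ^ ap[2 * m - k + i]
        c[i] = shared
        c[k + i] = ap[k + i] ^ shared
        xors += 2
    for i in range(0, m - k - 1, 2):  # (3c)
        c[k + i] = ap[m + k + i]
    for i in range(0, m - 2 * k, 2):  # (3e)
        c[2 * k + i] = ap[2 * k + i] ^ ap[m + k + i]
        xors += 1
    return sum(bit << i for i, bit in enumerate(c)), xors
```

For example, with $m = 7$ and $k = 3$ the formula uses 3 XORs, which equals $(m-1)/2$, and agrees with `square_naive` on all 128 inputs.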



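3.2 $m$ Is Odd and $1 < k < m/2$ Is Even

The definitions of $a'_i$ and $t_i^{(l)}$ are the same as those in the last subsection. We
rewrite the equation (1) here for convenience.

$$\sum_{i=0}^{m-1} t_i^{(l)} x^i = \sum_{i=0}^{m-1} t_i^{(l-1)} x^i + a'_{m+2l+1} x^{m+2l+1} \bmod f(x).$$

The terms $t_i^{(l)}$ have their initial values $t_i^{(-1)} = a'_i$, and we try to solve the final
values $t_i^{((m-1)/2 - 1)} = c_i$, $i = 0, 1, \dots, m - 1$.

When $l = 0$,

$$\sum_{i=0}^{m-1} t_i^{(0)} x^i = \sum_{i=0}^{m-1} t_i^{(-1)} x^i + a'_{m+1} x^{m+1} \bmod f(x)
  = \sum_{i=0}^{m-1} t_i^{(-1)} x^i + a'_{m+1} (x + x^{k+1}) \bmod f(x).$$

It follows

$$t_i^{(0)} = \begin{cases}
a'_{m+1}, & i = 1, k + 1; \\
a'_i, & i \text{ even}; \\
0, & i \text{ odd and } i \ne 1, k + 1.
\end{cases}$$

Since both the newly generated terms are odd power ones, no bit addition is
needed to obtain $t_i^{(0)}$ from $t_i^{(-1)}$, $i = 0, 1, \dots, m - 1$.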
For $l \ge 0$, we have

\[
\sum_{i=0}^{m-1} t_i^{(l)} x^i
= \sum_{i=0}^{m-1} t_i^{(l-1)} x^i + a'_{m+2l+1} x^{2l+1} (1 + x^k)
= \sum_{i=0}^{m-1} t_i^{(l-1)} x^i + a'_{m+2l+1} x^{2l+1} + a'_{m+2l+1} x^{k+2l+1}.
\]

It can be seen that two odd power terms are generated at the right side of the
above equation.

When $l$ runs through from $0$ to $k/2 - 1$, the value of $2l+1$ runs through the
odd numbers from $1$ to $k-1$, and the value of $k+2l+1$ runs through the odd
numbers from $k+1$ to $2k-1$. Note that $2k-1 < m-1$.

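Then we have

\[
t_i^{(l)} =
\begin{cases}
a'_{m+2l+1}, & i = 2l+1,\ k+2l+1;\\
t_i^{(l-1)}, & i \text{ even, or } i \text{ odd and } i = 1, 3, \ldots, 2l-1,\ k+1, k+3, \ldots, k+2l-1;\\
0, & \text{otherwise}
\end{cases}
\]
\[
\phantom{t_i^{(l)}} =
\begin{cases}
a'_{m+i}, & i = 2l+1;\\
a'_{m-k+i}, & i = k+2l+1;\\
t_i^{(l-1)}, & i \text{ even, or } i \text{ odd and } i = 1, 3, \ldots, 2l-1,\ k+1, k+3, \ldots, k+2l-1;\\
0, & \text{otherwise}
\end{cases}
\]
\[
\phantom{t_i^{(l)}} =
\begin{cases}
a'_{m+i}, & i = 1, 3, \ldots, 2l+1;\\
a'_{m-k+i}, & i = k+1, k+3, \ldots, k+2l+1;\\
a'_i, & i = 0, 2, \ldots, m-1;\\
0, & \text{otherwise.}
\end{cases}
\qquad (4)
\]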
When $l = k/2 - 1$, from the equation (4) we have

\[
t_i^{(k/2-1)} =
\begin{cases}
a'_{m+i}, & i = 1, 3, \ldots, k-1;\\
a'_{m-k+i}, & i = k+1, k+3, \ldots, 2k-1;\\
a'_i, & i = 0, 2, \ldots, m-1;\\
0, & \text{otherwise.}
\end{cases}
\qquad (5)
\]



In the following we consider two cases: 

1. If $2k < m-1$.

In this case we have $k \le m-k-3$. When $l$ runs through from $k/2$ to $(m-k-3)/2$
(in order to satisfy $k+2l+1 \le m-1$), the value of $2l+1$ runs through
the odd numbers from $k+1$ to $m-k-2$, and the value of $k+2l+1$ runs
through the odd numbers from $2k+1$ to $m-2$.

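From (4) and since $2l+1 \ge k+1$, we have

\[
t_i^{(l)} =
\begin{cases}
t_i^{(l-1)} + a'_{m+2l+1}, & i = 2l+1;\\
a'_{m+2l+1}, & i = k+2l+1;\\
t_i^{(l-1)}, & i \text{ even, or } i \text{ odd and } i = 1, 3, \ldots, 2l-1,\ 2l+3, \ldots, k+2l-1;\\
0, & \text{otherwise}
\end{cases}
\]
\[
\phantom{t_i^{(l)}} =
\begin{cases}
t_i^{(l-1)} + a'_{m+i}, & i = 2l+1;\\
a'_{m-k+i}, & i = k+2l+1;\\
t_i^{(l-1)}, & i \text{ even, or } i \text{ odd and } i = 1, 3, \ldots, 2l-1,\ 2l+3, \ldots, k+2l-1;\\
0, & \text{otherwise}
\end{cases}
\]
\[
\phantom{t_i^{(l)}} =
\begin{cases}
a'_{m+i}, & i = 1, 3, \ldots, k-1;\\
a'_{m-k+i}, & i = 2l+3, 2l+5, \ldots, k+2l+1;\\
a'_{m+i} + a'_{m-k+i}, & i = k+1, k+3, \ldots, 2l+1;\\
a'_i, & i = 0, 2, \ldots, m-1;\\
0, & \text{otherwise.}
\end{cases}
\]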



When $l = (m-k-3)/2$ it follows

\[
t_i^{((m-k-3)/2)} =
\begin{cases}
a'_{m+i}, & i = 1, 3, \ldots, k-1;\\
a'_{m+i} + a'_{m-k+i}, & i = k+1, k+3, \ldots, m-k-2;\\
a'_{m-k+i}, & i = m-k, m-k+2, \ldots, m-2;\\
a'_i, & i = 0, 2, \ldots, m-1;\\
0, & \text{otherwise.}
\end{cases}
\]



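When $(m-k-1)/2 \le l \le (m-3)/2$ we have

\[
\sum_{i=0}^{m-1} t_i^{(l)} x^i
= \sum_{i=0}^{m-1} t_i^{(l-1)} x^i + a'_{m+2l+1} x^{m+2l+1} \bmod f(x)
\]
\[
\phantom{\sum_{i=0}^{m-1} t_i^{(l)} x^i}
= \sum_{i=0}^{m-1} t_i^{(l-1)} x^i + a'_{m+2l+1} x^{2l+1} + a'_{m+2l+1} x^{k+2l+1} \bmod f(x)
\]
\[
\phantom{\sum_{i=0}^{m-1} t_i^{(l)} x^i}
= \sum_{i=0}^{m-1} t_i^{(l-1)} x^i + a'_{m+2l+1} x^{2l+2k+1-m} + a'_{m+2l+1} x^{2l+k+1-m} + a'_{m+2l+1} x^{2l+1}.
\]

When $l$ runs through from $(m-k-1)/2$ to $(m-3)/2$, the value of $2l+1$ runs
through the odd numbers from $m-k$ to $m-2$, the value of $2l+k+1-m$ runs
through the even numbers from $0$ to $k-2$, and the value of $2l+2k+1-m$
runs through the even numbers from $k$ to $2k-2$.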

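Therefore, we have

\[
t_i^{(l)} =
\begin{cases}
t_i^{(l-1)} + a'_{m+2l+1}, & i = 2l+1,\ 2l+k+1-m,\ 2l+2k+1-m;\\
t_i^{(l-1)}, & \text{otherwise}
\end{cases}
\]
\[
\phantom{t_i^{(l)}} =
\begin{cases}
t_i^{(l-1)} + a'_{m+i}, & i = 2l+1;\\
t_i^{(l-1)} + a'_{2m-k+i}, & i = 2l+k+1-m;\\
t_i^{(l-1)} + a'_{2m-2k+i}, & i = 2l+2k+1-m;\\
t_i^{(l-1)}, & \text{otherwise}
\end{cases}
\]
\[
\phantom{t_i^{(l)}} =
\begin{cases}
a'_{m+i}, & i = 1, 3, \ldots, k-1;\\
a'_{m+i} + a'_{m-k+i}, & i = k+1, k+3, \ldots, 2l+1;\\
a'_{m-k+i}, & i = 2l+3, 2l+5, \ldots, m-2;\\
a'_i + a'_{2m-k+i}, & i = 0, 2, \ldots, 2l+k+1-m;\\
a'_i + a'_{2m-2k+i}, & i = k, k+2, \ldots, 2l+2k+1-m;\\
a'_i, & \text{otherwise.}
\end{cases}
\]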



Then we can solve the final values for this case:

\[
c_i = t_i^{((m-3)/2)} =
\begin{cases}
a'_{m+i}, & i = 1, 3, \ldots, k-1;\\
a'_{m+i} + a'_{m-k+i}, & i = k+1, k+3, \ldots, m-2;\\
a'_i + a'_{2m-k+i}, & i = 0, 2, \ldots, k-2;\\
a'_i + a'_{2m-2k+i}, & i = k, k+2, \ldots, 2k-2;\\
a'_i, & i = 2k, 2k+2, \ldots, m-1.
\end{cases}
\qquad (6)
\]

From the above equation we conclude that the total cost for computing
squaring operation for this case is $(m+k-1)/2$ bit additions. The longest time
delay to compute a $c_i$ is the time taken to finish one bit addition.
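The closed form (6) can be checked numerically. The sketch below is only an illustration, not the author's implementation: it assumes $a'_{2i} = a_i$ and $a'_i = 0$ for odd $i$ (the reading of $a'$ that makes (6) consistent), and the function names are hypothetical. It compares (6) against a plain reduction modulo $x^m + x^k + 1$ and counts the bit additions. The identity is purely algebraic, so the check does not depend on the trinomial being irreducible.

```python
def square_by_reduction(a, m, k):
    """Square a (a[i] is the coefficient of x^i) modulo x^m + x^k + 1 over F_2."""
    t = [0] * (2 * m - 1)
    for i, bit in enumerate(a):
        t[2 * i] = bit                      # squaring over F_2 only spreads the bits
    for d in range(2 * m - 2, m - 1, -1):   # reduce the high powers, highest first
        if t[d]:
            t[d] = 0
            t[d - m + k] ^= 1               # x^d = x^(d-m+k) + x^(d-m)  (mod f)
            t[d - m] ^= 1
    return t[:m]


def square_by_closed_form(a, m, k):
    """Evaluate the closed form for m odd and k even; returns (c, additions)."""
    ap = [0] * (2 * m - 1)                  # assumed: a'_{2i} = a_i, a'_odd = 0
    for i, bit in enumerate(a):
        ap[2 * i] = bit
    c, additions = [0] * m, 0
    for i in range(m):
        if i % 2 == 1 and i <= k - 1:       # i = 1, 3, ..., k-1
            c[i] = ap[m + i]
        elif i % 2 == 1:                    # i = k+1, k+3, ..., m-2
            c[i] = ap[m + i] ^ ap[m - k + i]
            additions += 1
        elif i <= k - 2:                    # i = 0, 2, ..., k-2
            c[i] = ap[i] ^ ap[2 * m - k + i]
            additions += 1
        elif i <= 2 * k - 2:                # i = k, k+2, ..., 2k-2
            c[i] = ap[i] ^ ap[2 * m - 2 * k + i]
            additions += 1
        else:                               # i = 2k, 2k+2, ..., m-1
            c[i] = ap[i]
    return c, additions
```

Because squaring is linear over $F_2$, agreement on the unit vectors is enough to conclude agreement on every input.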

2. If $2k = m-1$.

In this case we have $2k-1 = m-2$. Thus from (5) it follows

\[
t_i^{(k/2-1)} =
\begin{cases}
a'_{m+i}, & i = 1, 3, \ldots, k-1;\\
a'_{m-k+i}, & i = k+1, k+3, \ldots, m-2;\\
a'_i, & i = 0, 2, \ldots, m-1.
\end{cases}
\]



Then for $k/2 \le l \le (m-3)/2$ we have

\[
\sum_{i=0}^{m-1} t_i^{(l)} x^i
= \sum_{i=0}^{m-1} t_i^{(l-1)} x^i + a'_{m+2l+1} x^{m+2l+1} \bmod f(x)
\]
\[
\phantom{\sum_{i=0}^{m-1} t_i^{(l)} x^i}
= \sum_{i=0}^{m-1} t_i^{(l-1)} x^i + a'_{m+2l+1} x^{2l+1} + a'_{m+2l+1} x^{2l+k+1-m} + a'_{m+2l+1} x^{2l+2k+1-m} \bmod f(x).
\]

When $l$ runs through from $k/2$ to $(m-3)/2$, the value of $2l+1$ runs through
the odd numbers from $k+1$ to $m-2$, the value of $2l+k+1-m$ runs through
the even numbers from $0$ to $k-2$, and the value of $2l+2k+1-m$ runs
through the even numbers from $k$ to $2k-2$.

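Therefore, we have

\[
t_i^{(l)} =
\begin{cases}
t_i^{(l-1)} + a'_{m+2l+1}, & i = 2l+1,\ 2l+k+1-m,\ 2l+2k+1-m;\\
t_i^{(l-1)}, & \text{otherwise}
\end{cases}
\]
\[
\phantom{t_i^{(l)}} =
\begin{cases}
t_i^{(l-1)} + a'_{m+i}, & i = 2l+1;\\
t_i^{(l-1)} + a'_{2m-k+i}, & i = 2l+k+1-m;\\
t_i^{(l-1)} + a'_{2m-2k+i}, & i = 2l+2k+1-m;\\
t_i^{(l-1)}, & \text{otherwise}
\end{cases}
\]
\[
\phantom{t_i^{(l)}} =
\begin{cases}
a'_{m+i}, & i = 1, 3, \ldots, k-1;\\
a'_{m+i} + a'_{m-k+i}, & i = k+1, k+3, \ldots, 2l+1;\\
a'_{m-k+i}, & i = 2l+3, 2l+5, \ldots, m-2;\\
a'_i + a'_{2m-k+i}, & i = 0, 2, \ldots, 2l+k+1-m;\\
a'_i + a'_{2m-2k+i}, & i = k, k+2, \ldots, 2l+2k+1-m;\\
a'_i, & \text{otherwise.}
\end{cases}
\]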





Then we can solve the final values for this case:

\[
c_i = t_i^{((m-3)/2)} =
\begin{cases}
a'_{m+i}, & i = 1, 3, \ldots, k-1;\\
a'_{m+i} + a'_{m-k+i}, & i = k+1, k+3, \ldots, m-2;\\
a'_i + a'_{2m-k+i}, & i = 0, 2, \ldots, k-2;\\
a'_i + a'_{2m-2k+i}, & i = k, k+2, \ldots, 2k-2;\\
a'_i, & i = 2k, 2k+2, \ldots, m-1.
\end{cases}
\qquad (7)
\]

From the above equation it is clear that the total cost for computing squaring
operation for this case is also $(m+k-1)/2$ bit additions.



3.3 m Is Even and 1 ≤ k ≤ m/2 Is Odd

When the field is generated with an irreducible trinomial of form $f(x) = x^m + x^k + 1$,
where $m$ is even and $k \le m/2$ is odd, similar analysis can be applied. In
this case the complexity for a PB squaring operation in $F_{2^m}$ is $(m+k-1)/2$ bit
additions if $k < m/2$, and $(m+2)/4$ bit additions if $k = m/2$ [6].

We summarize the results obtained from the three cases in this section in the
following theorem:

Theorem 1. If there is an irreducible polynomial $f(x) = x^m + x^k + 1$, $1 \le k \le m/2$,
over $F_2$, then a squaring operation in $F_{2^m}$ can be performed in

(i) $(m-1)/2$ bit additions, if both $m$ and $k$ are odd.

(ii) $(m+k-1)/2$ bit additions, if $m + k$ is odd and $k \neq m/2$.

(iii) $(m+2)/4$ bit additions, if $k = m/2$.



4 Bit-Parallel Implementation 

In hardware implementation, a bit addition in F2 can be realized using an XOR 
gate. If we denote the time propagation delay of an XOR gate by Tx, then the 
time delay of a hardware architecture can be measured in terms of gate delays. 

For example, from (3a-3d) it can be seen that the most bit operations taken
to compute a $c_i$ are when $i = 1, 3, \ldots, k-2$, as it is shown in (3d). Thus in this
case the longest time propagation delay in a bit-parallel architecture for squaring
is $2T_X$. The time delay for the other cases can be obtained from (2), (6), and
(7) in a similar way.

The results on the complexity for a bit-parallel implementation of squaring 
operation are summarized as follows: 

Theorem 2. If there is an irreducible polynomial $f(x) = x^m + x^k + 1$, $1 \le k \le m/2$,
over $F_2$, then a bit-parallel hardware implementation of squaring operation
in $F_{2^m}$ can be constructed with

(i) $(m-1)/2$ XOR gates and the incurred time delay is $2T_X$, if both $m$ and $k > 1$
are odd;

(ii) $(m-1)/2$ XOR gates and the incurred time delay is $T_X$, if $m$ is odd and $k = 1$;

(iii) $(m+k-1)/2$ XOR gates and the incurred time delay is $T_X$, if $m$ is odd and
$k$ is even;

(iv) $(m+k-1)/2$ XOR gates and the incurred time delay is $2T_X$, if $m$ is even and
$1 < k < m/2$ is odd;

(v) $m/2$ XOR gates and the incurred time delay is $T_X$, if $m$ is even and $k = 1$;

(vi) $(m+2)/4$ XOR gates and the incurred time delay is $T_X$, if $m$ is even and
$k = m/2$.
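The gate counts and delays stated in Theorem 2 can be explored with a small script. This is a sketch under stated assumptions rather than the paper's construction: the helper names are hypothetical, the XOR count is the naive one (no sharing of partial sums), and the delay is taken as the depth of a balanced XOR tree. Each output bit $c_i$ is written as the set of input bits $a_j$ that are XORed together.

```python
def squaring_terms(m, k):
    """rows[i] is the set of indices j such that a_j appears in c_i
    when squaring modulo x^m + x^k + 1 over F_2."""
    rows = [set() for _ in range(m)]
    for j in range(m):
        pending = [2 * j]                    # a_j contributes x^(2j) before reduction
        while pending:
            d = pending.pop()
            if d < m:
                rows[d] ^= {j}               # symmetric difference: equal terms cancel
            else:
                pending += [d - m + k, d - m]  # x^d = x^(d-m+k) + x^(d-m)  (mod f)
    return rows


def xor_depth(rows):
    """Delay in units of T_X, assuming a balanced XOR tree per output bit."""
    return max((max(len(r), 1) - 1).bit_length() for r in rows)


def naive_xor_count(rows):
    """XOR gates needed when no partial sum is shared between outputs."""
    return sum(max(len(r), 1) - 1 for r in rows)
```

For $m$ and $k > 1$ both odd, the naive count exceeds $(m-1)/2$ by exactly the $(k-1)/2$ partial sums that (3b) and (3d) share; in the other cases checked below the naive count already matches the theorem.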



5 Concluding Remarks 



Squaring operation is frequently required in elliptic curve cryptographic systems
when an inversion or a point multiple operation is performed. Normal basis
has been widely used because squaring operation using normal basis is only
a cyclic shift of the coefficients. However, normal basis multiplication can be
performed efficiently only when there is an optimal normal basis [5]. The results
in this paper have shown that the complexity of a PB squaring operation is
very low compared with that of a multiplication operation ($O(m^2)$). This fact
suggests that polynomial basis might be a good replacement for normal basis
in many cryptographic applications, given the prevalence of irreducible
trinomials [1] compared with that of optimal normal bases.



Abstract. A metering scheme is a protocol in which an audit agency is
able to measure the interaction between clients and servers on the web 
during a certain number of time frames. Naor and Pinkas [7] considered 
metering schemes in which any server is able to construct a proof to be 
sent to the audit agency if and only if it has been visited by at least 
a number, say h, of clients in a given time frame. In their schemes the 
parameter h is fixed and is the same for any server and any time frame. 
In this paper we introduce dynamic multi-threshold metering schemes, 
that are metering schemes in which there is a threshold associated to 
any server for any time frame. We mainly focus on the efficiency of dy- 
namic multi-threshold metering schemes, by minimizing the information 
received and distributed by clients. This is important because the clients 
participating in the metering process do not receive any money from the 
audit agency. 

Keywords: Metering Schemes, Security, Cryptography, Entropy. 



1 Introduction 

Most of the revenues of web sites come from advertisement payments. Web 
advertisers must have a way to measure the exposure of their ads by obtaining 
usage statistics about web sites which contain their ads. Indeed, the amount of 
money charged to display ads depends on the number of visits received by the 
web site. Consequently, advertisers should prevent the web sites from inflating 
the count of their visits in order to demand more money. Hence, there should 
be a mechanism which ensures the validity and accuracy of usage measurements 
against fraud attempts by servers (web sites) and clients (visitors). In a typical 
scenario there are many servers and clients, and an audit agency whose task 
is to measure the interaction between the servers and the clients. A system for 
measuring the amount of services performed by the servers is called metering 
scheme. 



D.R. Stinson and S. Tavares (Eds.): SAC 2000, LNCS 2012, pp. 130—144, 2001. 
(c) Springer-Verlag Berlin Heidelberg 2001

Naor and Pinkas [7] proposed metering schemes in which any server is able
to present to the audit agency a short proof for the number of client visits it has 
received in a given time frame. In their schemes all servers are associated to a 
threshold h, and are able to compute their proofs for a certain time frame if and 
only if they have been visited by a number of clients larger than or equal to h 
in that time frame. The schemes proposed by Naor and Pinkas are also efficient: 
the task for the audit agency in sending information to clients and servers is very 
simple, as well as the task for the servers in computing their proofs. Recently, 
different kinds of metering schemes have been proposed. Metering schemes for 
ramp structures [1,3] have been introduced in order to reduce the overhead to 
the overall communication due to the metering process. Metering schemes with 
pricing [1,5] have been introduced in order to have a more flexible payment 
system. Finally, metering schemes for general access structures [6] have been 
introduced in order to measure the interaction between servers and particular 
groups of clients. 

In metering schemes considered by Naor and Pinkas [7] the parameter h is 
fixed and is the same for any server and any time frame. This is acceptable 
whenever there is a long-term relationship between the audit agency and the 
servers. In order to measure any number of visits in any granularity we introduce 
dynamic multi-threshold metering schemes, which are metering schemes in which 
there is a threshold $h_j^t$ associated to any server $\mathcal{S}_j$ for any time frame $t$.

Dynamic multi-threshold metering schemes involve distributing information 
to clients and servers. Obviously, such information distribution affects the over- 
all communication complexity. Therefore, it is important to construct schemes 
whose overhead to the overall communication is as small as possible. We mainly 
focus on the efficiency of dynamic multi-threshold metering schemes, by min- 
imizing the information received and distributed by clients. This is important 
because the clients participating in the metering process do not receive any 
money from the audit agency. In this paper we provide lower bounds on the size 
of the information received and distributed by clients and we present a scheme 
achieving these lower bounds. 



2 The Model 



Consider the following scenario: there are $n$ clients, $m$ servers and an audit
agency $A$ which is interested in counting the client visits to the servers in $\tau$
different time frames. For any $i = 1, \ldots, n$ and $j = 1, \ldots, m$, we denote by $\mathcal{C}_i$
the $i$-th client and by $\mathcal{S}_j$ the $j$-th server.

There is an initialization phase in which the audit agency $A$ distributes some
information to any client over a private channel. For any $i = 1, \ldots, n$, we denote
by $c_i$ the information that the audit agency $A$ gives to the client $\mathcal{C}_i$. Moreover,
we denote by $C_i$ the set of all values that $c_i$ can assume. Given a set of client
indices $Z = \{1, \ldots, \alpha\} \subseteq \{1, \ldots, n\}$, we denote by $C_Z$ the cartesian product
$C_1 \times \cdots \times C_\alpha$.

At the beginning of any time frame the audit agency $A$ distributes to any
server a piece of information which depends on the identity of the server and
on the time frame. For any $j = 1, \ldots, m$ and $t = 1, \ldots, \tau$, we denote by $s_j^t$ the
information that the audit agency $A$ gives to the server $\mathcal{S}_j$ at the beginning of
time frame $t$. Moreover, we denote by $S_j^t$ the set of all values that $s_j^t$ can assume.
Given a set of server indices $B = \{1, \ldots, \beta\} \subseteq \{1, \ldots, m\}$, we denote by $S_B^t$ the
cartesian product $S_1^t \times \cdots \times S_\beta^t$.

A regular operation consists in a client visit to a server during a time frame.
During such a visit the client gives to the visited server a piece of information
which depends on its private information, on the identity of the server, and on
the time frame during which the client visits the server. For any $i = 1, \ldots, n$,
$j = 1, \ldots, m$, and $t = 1, \ldots, \tau$, we denote by $c_{i,j}^t$ the information that the client
$\mathcal{C}_i$ sends to the server $\mathcal{S}_j$ when visiting it in time frame $t$. Moreover, we denote by
$C_{i,j}^t$ the set of all values that $c_{i,j}^t$ can assume. Given a set of server indices
$B = \{1, \ldots, \beta\} \subseteq \{1, \ldots, m\}$, we denote by $C_{i,B}^t$ the cartesian product
$C_{i,1}^t \times \cdots \times C_{i,\beta}^t$. Moreover, given a set of client indices
$Z = \{1, \ldots, \alpha\} \subseteq \{1, \ldots, n\}$, we denote by $C_{Z,B}^t$ the cartesian product
$C_{1,B}^t \times \cdots \times C_{\alpha,B}^t$. For any $j = 1, \ldots, m$ and
$t = 1, \ldots, \tau$, we denote by $X_{j,(d_j)}^t$ the set of the $d_j$ client visits received by server
$\mathcal{S}_j$ in time frame $t$.

During the proof computation stage any server $\mathcal{S}_j$ which has received at least
$h_j^t$ visits during time frame $t$ is able to compute its proof for time frame $t$, as
function of the information provided by the $h_j^t$ clients and the information $s_j^t$
provided by the audit agency $A$ at the beginning of the time frame $t$. For any
$j = 1, \ldots, m$ and $t = 1, \ldots, \tau$, we denote by $p_j^t$ the proof computed by the
server $\mathcal{S}_j$ when it has been visited by at least $h_j^t$ distinct clients in time frame
$t$. Moreover, we denote by $P_j^t$ the set of all values that $p_j^t$ can assume. Given a
set of server indices $B = \{1, \ldots, \beta\} \subseteq \{1, \ldots, m\}$, we denote by $P_B^t$ the cartesian
product $P_1^t \times \cdots \times P_\beta^t$.

During the proof verification stage the audit agency A verifies the proofs 
received by servers and decides on the amount of money to be paid to servers. 
If the proof received from a server at the end of a time frame is correct, then A 
pays the server for its services. 

A corrupt server can be assisted by corrupt clients and other corrupt servers
in order to inflate the count of its visits. A corrupt client $\mathcal{C}_i$ can donate to
a corrupt server the whole private information received from the audit agency
during the initialization phase. We assume that the number of corrupt clients
is $c$, where $1 \le c < \min_{j=1,\ldots,m} \min_{t=1,\ldots,\tau} h_j^t$. A corrupt server can donate to
another corrupt server the private information received from the audit agency
at the beginning of any time frame in addition to the information received from
clients in previous time frames and in the current time frame. For any $j = 1, \ldots, m$
and $t = 1, \ldots, \tau$, we denote by $V_j^{[t]}$ all the information received by a corrupt
server $\mathcal{S}_j$ in time frames $1, \ldots, t$. This information includes the sets of client
visits received by server $\mathcal{S}_j$ in time frames $1, \ldots, t$. We also define $V_j^{[0]} = \emptyset$, for
any corrupt server $\mathcal{S}_j$. We assume that the maximum number of corrupt servers
is $s$, where $1 \le s \le m$. For the reader's convenience, the notations used in this
section are summarized in Appendix B.

In this paper with a boldface capital letter, say $\mathbf{X}$, we denote a random
variable taking value on a set denoted by the corresponding capital letter $X$
according to some probability distribution $\{Pr_{\mathbf{X}}(x)\}_{x \in X}$. The values such a
random variable can take are denoted by the corresponding lowercase letter. Given a
random variable $\mathbf{X}$ we denote with $H(\mathbf{X})$ the Shannon entropy of $\{Pr_{\mathbf{X}}(x)\}_{x \in X}$
(for some basic properties of entropy, consult the Appendix A).

We formally define dynamic multi-threshold metering schemes by using the 
entropy approach, as done in [1,3, 5, 6]. We use the entropy approach mainly be- 
cause this leads to a compact and simple description of the schemes and because 
the entropy approach takes into account all probability distributions on the sets 
of the proofs computed by the servers. 

Definition 1. An $(n, m, \tau, c, s, \{h_j^t\}_{j=1,\ldots,m}^{t=1,\ldots,\tau})$ dynamic multi-threshold metering
scheme is a protocol to measure the interaction between $n$ clients and $m$ servers
during $\tau$ time frames in such a way that the following properties are satisfied:

1. Any client is able to compute the information needed to visit any server in
any time frame:

Formally, it holds that $H(\mathbf{C}_{i,j}^t \mid \mathbf{C}_i) = 0$ for $i = 1, \ldots, n$, $j = 1, \ldots, m$, and
$t = 1, \ldots, \tau$.

2. Any server $\mathcal{S}_j$ which has received $h_j^t$ client visits during time frame $t$ and the
message provided by $A$ at the beginning of the time frame $t$ can compute its
proof for $t$:

Formally, it holds that $H(\mathbf{P}_j^t \mid \mathbf{X}_{j,(h_j^t)}^t \mathbf{S}_j^t) = 0$, for $j = 1, \ldots, m$ and
$t = 1, \ldots, \tau$.

3. Let us consider a coalition of $\alpha$ corrupt clients $\mathcal{C}_1, \ldots, \mathcal{C}_\alpha$ and $\beta$ corrupt
servers $\mathcal{S}_1, \ldots, \mathcal{S}_\beta$, where $0 \le \alpha \le c < \min_{j=1,\ldots,m} \min_{t=1,\ldots,\tau} h_j^t$ and
$1 \le \beta \le s$, and let $B = \{1, \ldots, \beta\}$. Assume that at some time frame $t$ each server
$\mathcal{S}_j$ in the coalition has been visited by less than $h_j^t - \alpha$ clients and has received
the information by $A$. Then, the servers in the coalition have no information
on their proofs for $t$:

Formally, it holds that
$H(\mathbf{P}_B^t \mid \mathbf{C}_1 \ldots \mathbf{C}_\alpha \mathbf{X}_{1,(d_1)}^t \ldots \mathbf{X}_{\beta,(d_\beta)}^t \mathbf{S}_B^t \mathbf{V}_B^{[t-1]}) = H(\mathbf{P}_B^t)$,
where $d_j < h_j^t - \alpha$, for $j = 1, \ldots, \beta$.

Notice that Naor and Pinkas [7] considered metering schemes which are
"static" and with "single threshold", i.e., where $h_j^t = h$ for $j = 1, \ldots, m$ and
$t = 1, \ldots, \tau$. Moreover, their schemes do not require communication between
audit agency and servers at the beginning of any time frame.



3 A Dynamic Multi-threshold Metering Protocol

In this section we present a dynamic multi-threshold metering scheme which 
is optimal with respect to the bounds (4) and (5) presented in Section 4. The 
protocol is a generalization of Naor and Pinkas metering scheme [7].

Initialization: For $j = 1, \ldots, m$ and $t = 1, \ldots, \tau$, let $h_j^t$ be the threshold
associated to the server $\mathcal{S}_j$ in time frame $t$ and let $h = \max_{j=1,\ldots,m} \max_{t=1,\ldots,\tau} h_j^t + 1$.
The audit agency $A$ chooses a random polynomial $Q(x, y)$ of degree $h - 1$ in $x$
and $s\tau - 1$ in $y$ over $GF(q)$, where $q$ is a sufficiently large prime number.
Afterwards, $A$ sends the univariate polynomial $Q(i, y)$, which is of degree $s\tau - 1$, to
each client $\mathcal{C}_i$.

Beginning of a Time Frame: At the beginning of time frame $t$, for any
server $\mathcal{S}_j$, the audit agency $A$ evaluates the polynomial $Q(x, j \circ t)$ in $h - h_j^t$
points other than [...] and sends these values to $\mathcal{S}_j$. The argument $j \circ t$
denotes the concatenation of $j$ and $t$, and we assume for simplicity that $j \circ t$ is in
$GF(q)$ and that no two distinct pairs $(j, t)$ and $(j', t')$ are mapped to the same
element.

Regular Operation: When the client $\mathcal{C}_i$ visits the server $\mathcal{S}_j$ in time frame $t$,
it sends the value $Q(i, j \circ t)$ to $\mathcal{S}_j$.

Proof Generation and Verification: Assume that the server $\mathcal{S}_j$ has been
visited by at least $h_j^t$ different clients in time frame $t$. Then, knowing the $h - h_j^t$
points of $Q(x, j \circ t)$ provided by the audit agency at the beginning of time frame $t$,
the server can perform a Lagrange interpolation and reconstruct the polynomial
$Q(x, j \circ t)$. Then, it can compute the value $Q(0, j \circ t)$, which constitutes the proof
that the server sends to the audit agency. The audit agency can easily verify this
value.



3.1 Security of the Scheme 

In this section we prove that the scheme presented in Section 3 satisfies Proper- 
ties 1, 2, and 3 of Definition 1. 

It is immediate to verify that the scheme satisfies Property 1 of Definition 1.
Indeed, for any $i = 1, \ldots, n$, the information given by the audit agency to the
client $\mathcal{C}_i$ consists of the univariate polynomial $Q(i, y)$ and for any $j = 1, \ldots, m$
and $t = 1, \ldots, \tau$, the information given to the server $\mathcal{S}_j$ by client $\mathcal{C}_i$ in time frame
$t$ is obtained by evaluating the univariate polynomial $Q(i, y)$ at $j \circ t$.

It is also easy to verify that the scheme satisfies Property 2 of Definition 1.
Assume that a server $\mathcal{S}_j$ has been visited by $h_j^t$ clients in time frame $t$ and that it
has received $h - h_j^t$ points of $Q(x, j \circ t)$ from the audit agency at the beginning of
time frame $t$. Therefore, the server $\mathcal{S}_j$ knows $h$ points of the polynomial $Q(x, j \circ t)$
and can perform a Lagrange interpolation on it. Afterwards, it can compute its
proof $Q(0, j \circ t)$ by evaluating the polynomial $Q(x, j \circ t)$ at the point $0$.

Finally, we prove that the scheme satisfies Property 3 of Definition 1. We
consider the worst possible case in which $c$ corrupt clients decide to cooperate
with $s$ corrupt servers at time frame $\tau$. Moreover we assume that the corrupt
servers have collected the maximum possible information during the previous
time frames $1, \ldots, \tau - 1$. In other words, we assume that each corrupt client $\mathcal{C}_i$
gives its polynomial $Q(i, y)$ to all servers in the coalition, and that any corrupt
server $\mathcal{S}_j$ in the coalition knows the polynomial $Q(x, j \circ t)$ for $t = 1, \ldots, \tau - 1$.

In order to compute its proof $Q(0, j \circ \tau)$ for time frame $\tau$, any server $\mathcal{S}_j$ should
be able to interpolate either the polynomial $Q(x, j \circ \tau)$ or the bivariate polynomial
$Q(x, y)$. Notice that for any $j, k \in \{1, \ldots, s\}$, with $j \neq k$, the information held
by the server $\mathcal{S}_k$ is of no help in computing the polynomial $Q(x, j \circ \tau)$. Let
$g_j = h_j^\tau - c - 1$ be the number of client visits received by server $\mathcal{S}_j$ during time
frame $\tau$. Each corrupt client $\mathcal{C}_i$ donates to $\mathcal{S}_j$ the polynomial $Q(i, y)$ from which
$\mathcal{S}_j$ can compute the value $Q(i, j \circ \tau)$. Since there are $c$ corrupt clients, $\mathcal{S}_j$ can
compute $c$ values of $Q(x, j \circ \tau)$ in addition to those provided by the $g_j$ visits
performed by non corrupt clients. Since the server $\mathcal{S}_j$ has also received $h - h_j^\tau$
points of $Q(x, j \circ \tau)$ by the audit agency at the beginning of time frame $\tau$, the
overall number of points of $Q(x, j \circ \tau)$ known to $\mathcal{S}_j$ is $g_j + c + h - h_j^\tau = h - 1$.
Therefore, the server obtains a linear system of $h - 1$ equations in $h$ unknowns.
For any choice of a value in $GF(q)$, there is a polynomial $R(x, j \circ \tau)$ which
is consistent with the information held by the server. Since there are $q$ such
polynomials, the probability of the server in guessing its proof for time frame $\tau$
is at most $1/q$.
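The counting step in this argument can be checked by brute force on a toy instance. The sketch below is illustrative only (the function name and the small parameters are hypothetical): it enumerates every polynomial of degree at most $h - 1$ over a small prime field and, for each candidate proof value $v$, counts how many of them pass through $h - 1$ known points at distinct nonzero x-values and take the value $v$ at $0$.

```python
from itertools import product


def consistent_proofs(points, h, q):
    """For each candidate proof v in GF(q), count the polynomials of degree
    at most h-1 that match all known points and evaluate to v at x = 0."""
    counts = {v: 0 for v in range(q)}
    for coeffs in product(range(q), repeat=h):   # coeffs[e] multiplies x^e
        if all(sum(c * pow(x, e, q) for e, c in enumerate(coeffs)) % q == y
               for x, y in points):
            counts[coeffs[0]] += 1               # coeffs[0] is the value at x = 0
    return counts
```

With $h - 1$ known points every candidate value is matched by exactly one polynomial, so the known points carry no information about the proof, in line with the $1/q$ bound above.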

Alternatively, the coalition of corrupt servers might try to interpolate the
polynomial $Q(x, y)$ in order to compute the proofs. The information that a
corrupt client $\mathcal{C}_i$ gives to a corrupt server is equivalent to the $s\tau$ coefficients of its
polynomial $Q(i, y)$. For $j = 1, \ldots, s$, the information collected by each corrupt
server $\mathcal{S}_j$ at the beginning of time frame $\tau$ is constituted by the information
provided by the audit agency at the beginning of any time frame $t = 1, \ldots, \tau$,
which consists in $h - h_j^t$ coefficients of $Q(x, j \circ t)$, in addition to the
information provided by clients during each time frame $t = 1, \ldots, \tau - 1$, which consists
in $h_j^t$ coefficients of $Q(x, j \circ t)$. Hence, at the beginning of time frame $\tau$ each
corrupt server holds $(\tau - 1)h$ coefficients of $Q(x, y)$ and $h - h_j^\tau$ coefficients of
$Q(x, j \circ \tau)$. Suppose that in time frame $\tau$ each server $\mathcal{S}_j$, $j \in \{1, \ldots, s\}$, receives
$g_j \le h_j^\tau - c - 1$ regular visits from clients. Then, the overall information on
$Q(x, y)$ held by the coalition of corrupt servers and clients at the end of time
frame $\tau$ consists of

\[
c s \tau + s(\tau - 1)h + \sum_{j=1}^{s} (h - h_j^\tau) + \sum_{j=1}^{s} g_j - c s (\tau - 1)
\qquad (1)
\]

points. The first term of (1) corresponds to the information donated by the $c$
corrupt clients, the second term corresponds to the information collected by the
$s$ corrupt servers during time frames $1, \ldots, \tau - 1$, the third term corresponds to
the information provided by the audit agency at the beginning of time frame
$\tau$, the fourth term corresponds to the information provided by client visits at
time frame $\tau$, and the last term corresponds to the information which has been
counted twice. Since $g_j \le h_j^\tau - c - 1$ for $j = 1, \ldots, s$, it is easy to see that
expression (1) is less than or equal to $hs\tau - s$. Therefore, the servers obtain a
system of at most $hs\tau - s$ equations in $hs\tau$ unknowns. For any choice of $s$ values
in $GF(q)$, there is a polynomial $R(x, y)$ which is consistent with the information
held by the servers in the coalition. Since there are $q^s$ such polynomials, then
the corrupt servers $\mathcal{S}_1, \ldots, \mathcal{S}_s$ have probability at most $1/q^s$ of guessing their
proofs for time frame $\tau$.



4 Lower Bounds on the Size 

of the Information Distributed to Clients and Servers 

Dynamic multi-threshold metering schemes involve distributing information to 
clients and servers. In this section we provide lower bounds on the size of the 
information received by clients from the audit agency and distributed by clients 
to servers in dynamic multi-threshold metering schemes. 

In order to prove our results we will resort to the two following technical lemmas. 

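Lemma 2. Let $\mathbf{A}$ and $\mathbf{E}$ be two random variables such that $H(\mathbf{A} \mid \mathbf{E}) = 0$. Then,
for any two random variables $\mathbf{F}$ and $\mathbf{G}$, it holds that

\[ H(\mathbf{G} \mid \mathbf{A}\mathbf{E}\mathbf{F}) = H(\mathbf{G} \mid \mathbf{E}\mathbf{F}). \]

Proof. Consider the mutual information $I(\mathbf{A}; \mathbf{G} \mid \mathbf{E}\mathbf{F})$. From (12) of Appendix A
it holds that

\[ H(\mathbf{A} \mid \mathbf{E}\mathbf{F}) - H(\mathbf{A} \mid \mathbf{E}\mathbf{F}\mathbf{G}) = H(\mathbf{G} \mid \mathbf{E}\mathbf{F}) - H(\mathbf{G} \mid \mathbf{A}\mathbf{E}\mathbf{F}). \]

From (13) of Appendix A we have that $H(\mathbf{A} \mid \mathbf{E}\mathbf{F}\mathbf{G}) \le H(\mathbf{A} \mid \mathbf{E}\mathbf{F}) \le H(\mathbf{A} \mid \mathbf{E})$.
Since $H(\mathbf{A} \mid \mathbf{E}) = 0$, it follows that

\[ H(\mathbf{G} \mid \mathbf{A}\mathbf{E}\mathbf{F}) = H(\mathbf{G} \mid \mathbf{E}\mathbf{F}). \]

□

Lemma 3. Let $\mathbf{E}$, $\mathbf{F}$, and $\mathbf{G}$ be three random variables such that $H(\mathbf{G} \mid \mathbf{E}\mathbf{F}) = 0$
and $H(\mathbf{G} \mid \mathbf{E}) = H(\mathbf{G})$. Then, it holds that

\[ H(\mathbf{F} \mid \mathbf{E}) = H(\mathbf{G}) + H(\mathbf{F} \mid \mathbf{E}\mathbf{G}). \]

Proof. Consider the mutual information $I(\mathbf{F}; \mathbf{G} \mid \mathbf{E})$. From (12) of Appendix A it
holds that

\[ H(\mathbf{F} \mid \mathbf{E}) - H(\mathbf{F} \mid \mathbf{E}\mathbf{G}) = H(\mathbf{G} \mid \mathbf{E}) - H(\mathbf{G} \mid \mathbf{E}\mathbf{F}). \]

Since $H(\mathbf{G} \mid \mathbf{E}\mathbf{F}) = 0$ and $H(\mathbf{G} \mid \mathbf{E}) = H(\mathbf{G})$, then it follows that
$H(\mathbf{F} \mid \mathbf{E}) = H(\mathbf{G}) + H(\mathbf{F} \mid \mathbf{E}\mathbf{G})$. □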
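Lemma 3 can be sanity-checked on a toy distribution. The example below is only an illustration and its choice of variables is an assumption made here: $\mathbf{E}$ and $\mathbf{F}$ are independent uniform bits and $\mathbf{G} = \mathbf{E} \oplus \mathbf{F}$, so that $\mathbf{G}$ is determined by $(\mathbf{E}, \mathbf{F})$ while $\mathbf{E}$ alone reveals nothing about it. Entropies are computed from the joint probability table, in bits.

```python
from collections import defaultdict
from math import log2


def entropy(joint, keep):
    """Shannon entropy (in bits) of the coordinates listed in keep."""
    marginal = defaultdict(float)
    for outcome, p in joint.items():
        marginal[tuple(outcome[i] for i in keep)] += p
    return -sum(p * log2(p) for p in marginal.values() if p > 0)


def cond_entropy(joint, target, given):
    """H(target | given) = H(target, given) - H(given)."""
    return entropy(joint, target + given) - entropy(joint, given)


# Coordinates of each outcome: 0 -> E, 1 -> F, 2 -> G = E xor F.
E, F, G = 0, 1, 2
joint = {(e, f, e ^ f): 0.25 for e in (0, 1) for f in (0, 1)}

hypothesis_1 = cond_entropy(joint, [G], [E, F])      # H(G | EF), should be 0
hypothesis_2 = cond_entropy(joint, [G], [E])         # H(G | E), should equal H(G)
left_side = cond_entropy(joint, [F], [E])            # H(F | E)
right_side = entropy(joint, [G]) + cond_entropy(joint, [F], [E, G])
```

Here both hypotheses of the lemma hold, and the two sides of its conclusion agree: $\mathbf{F}$ has one bit of uncertainty given $\mathbf{E}$, all of which is accounted for by $\mathbf{G}$.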

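The next lemma immediately follows from Definition 1. Recall that for any sets
of client and server indices $Z = \{1, \ldots, \alpha\} \subseteq \{1, \ldots, n\}$ and $B = \{1, \ldots, \beta\} \subseteq
\{1, \ldots, m\}$, respectively, we denote by $C_{Z,B}^t$ the information given by clients
$\mathcal{C}_1, \ldots, \mathcal{C}_\alpha$ to servers $\mathcal{S}_1, \ldots, \mathcal{S}_\beta$ during their visits in time frame $t$.

Lemma 4. Let $M$ be an $(n, m, \tau, c, s, \{h_j^t\}_{j=1,\ldots,m}^{t=1,\ldots,\tau})$ dynamic multi-threshold
metering scheme. Let $Z = \{1, \ldots, \alpha\}$ be a set of client indices and let
$B = \{1, \ldots, \beta\}$ be a set of server indices. Then, for any time frame $t = 1, \ldots, \tau$, it
holds that

\[ H(\mathbf{C}_{Z,B}^t \mid \mathbf{C}_Z) = 0. \]

Proof. We have that

\[
\begin{aligned}
H(\mathbf{C}_{Z,B}^t \mid \mathbf{C}_Z)
&\le \sum_{i=1}^{\alpha} \sum_{j=1}^{\beta} H(\mathbf{C}_{i,j}^t \mid \mathbf{C}_Z) && \text{(from (15) of Appendix A)}\\
&\le \sum_{i=1}^{\alpha} \sum_{j=1}^{\beta} H(\mathbf{C}_{i,j}^t \mid \mathbf{C}_i) && \text{(from (13) of Appendix A)}\\
&= 0 && \text{(from Property 1 of Definition 1).}
\end{aligned}
\]

□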



The next lemma will be a useful tool to prove a lower bound on the size of the 
information distributed to servers from clients during their visits. 

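Lemma 5. Let $M$ be an $(n, m, \tau, c, s, \{h_j^t\}_{j=1,\ldots,m}^{t=1,\ldots,\tau})$ dynamic multi-threshold
metering scheme. Let $\mathcal{S}_1, \ldots, \mathcal{S}_\beta$ be a coalition of $\beta \le s$ corrupt servers and
let $B = \{1, \ldots, \beta\}$. Let $\mathcal{C}_i$ be a client and for $j = 1, \ldots, \beta$ and $t = 1, \ldots, \tau$, let
$X_{j,(h_j^t - 1)}^t$ be a set of visits from $h_j^t - 1$ clients other than $\mathcal{C}_i$ to server $\mathcal{S}_j$ in time
frame $t$. Then, for any $t = 1, \ldots, \tau$ and $i = 1, \ldots, n$, it holds that

\[
H(\mathbf{C}_{i,B}^t \mid \mathbf{X}_{1,(h_1^t - 1)}^t \ldots \mathbf{X}_{\beta,(h_\beta^t - 1)}^t \mathbf{S}_B^t \mathbf{V}_B^{[t-1]}) \ge H(\mathbf{P}_B^t).
\]

Proof. Let $\mathcal{C}_1, \ldots, \mathcal{C}_\alpha$ be a coalition of $\alpha \le c$ corrupt clients other than $\mathcal{C}_i$. Let
us consider the random variables
$\mathbf{E} = \mathbf{C}_1 \ldots \mathbf{C}_\alpha \mathbf{X}_{1,(h_1^t - \alpha - 1)}^t \ldots \mathbf{X}_{\beta,(h_\beta^t - \alpha - 1)}^t \mathbf{S}_B^t \mathbf{V}_B^{[t-1]}$,
$\mathbf{A} = \mathbf{C}_{1,B}^t \ldots \mathbf{C}_{\alpha,B}^t$, $\mathbf{F} = \mathbf{C}_{i,B}^t$, and $\mathbf{G} = \mathbf{P}_B^t$. We have that

\[
\begin{aligned}
H(\mathbf{C}_{1,B}^t \ldots \mathbf{C}_{\alpha,B}^t \mid \mathbf{C}_1 \ldots \mathbf{C}_\alpha \mathbf{C}_{i,B}^t \mathbf{X}_{1,(h_1^t - \alpha - 1)}^t \ldots \mathbf{X}_{\beta,(h_\beta^t - \alpha - 1)}^t \mathbf{S}_B^t \mathbf{V}_B^{[t-1]})
&\le H(\mathbf{C}_{1,B}^t \ldots \mathbf{C}_{\alpha,B}^t \mid \mathbf{C}_1 \ldots \mathbf{C}_\alpha) && \text{(from (13) of Appendix A)}\\
&= 0 && \text{(from Lemma 4).}
\end{aligned}
\]

Hence, $\mathbf{A}$, $\mathbf{E}$, and $\mathbf{F}$ verify the hypothesis of Lemma 2, and one has
$H(\mathbf{G} \mid \mathbf{E}\mathbf{F}) = H(\mathbf{G} \mid \mathbf{A}\mathbf{E}\mathbf{F})$, that is,

\[
\begin{aligned}
H(\mathbf{P}_B^t \mid \mathbf{C}_1 \ldots \mathbf{C}_\alpha \mathbf{C}_{i,B}^t \mathbf{X}_{1,(h_1^t - \alpha - 1)}^t \ldots \mathbf{X}_{\beta,(h_\beta^t - \alpha - 1)}^t \mathbf{S}_B^t \mathbf{V}_B^{[t-1]})
&= H(\mathbf{P}_B^t \mid \mathbf{C}_{1,B}^t \ldots \mathbf{C}_{\alpha,B}^t \mathbf{C}_1 \ldots \mathbf{C}_\alpha \mathbf{C}_{i,B}^t \mathbf{X}_{1,(h_1^t - \alpha - 1)}^t \ldots \mathbf{X}_{\beta,(h_\beta^t - \alpha - 1)}^t \mathbf{S}_B^t \mathbf{V}_B^{[t-1]})\\
&\le H(\mathbf{P}_B^t \mid \mathbf{C}_{1,B}^t \ldots \mathbf{C}_{\alpha,B}^t \mathbf{C}_{i,B}^t \mathbf{X}_{1,(h_1^t - \alpha - 1)}^t \ldots \mathbf{X}_{\beta,(h_\beta^t - \alpha - 1)}^t \mathbf{S}_B^t) && \text{(from (13) of Appendix A)}\\
&= H(\mathbf{P}_B^t \mid \mathbf{X}_{1,(h_1^t)}^t \ldots \mathbf{X}_{\beta,(h_\beta^t)}^t \mathbf{S}_B^t)\\
&\le \sum_{j=1}^{\beta} H(\mathbf{P}_j^t \mid \mathbf{X}_{j,(h_j^t)}^t \mathbf{S}_j^t) && \text{(from (13) and (15) of Appendix A)}\\
&= 0 && \text{(from Property 2 of Definition 1).}
\end{aligned}
\]

From Property 3 of Definition 1 we have that

\[
H(\mathbf{P}_B^t \mid \mathbf{C}_1 \ldots \mathbf{C}_\alpha \mathbf{X}_{1,(h_1^t - \alpha - 1)}^t \ldots \mathbf{X}_{\beta,(h_\beta^t - \alpha - 1)}^t \mathbf{S}_B^t \mathbf{V}_B^{[t-1]}) = H(\mathbf{P}_B^t).
\]

Hence, $\mathbf{E} = \mathbf{C}_1 \ldots \mathbf{C}_\alpha \mathbf{X}_{1,(h_1^t - \alpha - 1)}^t \ldots \mathbf{X}_{\beta,(h_\beta^t - \alpha - 1)}^t \mathbf{S}_B^t \mathbf{V}_B^{[t-1]}$,
$\mathbf{F} = \mathbf{C}_{i,B}^t$, and $\mathbf{G} = \mathbf{P}_B^t$
verify the hypothesis of Lemma 3 and one has $H(\mathbf{F} \mid \mathbf{E}) = H(\mathbf{G}) + H(\mathbf{F} \mid \mathbf{E}\mathbf{G})$,
that is

\[
\begin{aligned}
H(\mathbf{C}_{i,B}^t \mid \mathbf{C}_1 \ldots \mathbf{C}_\alpha \mathbf{X}_{1,(h_1^t - \alpha - 1)}^t \ldots \mathbf{X}_{\beta,(h_\beta^t - \alpha - 1)}^t \mathbf{S}_B^t \mathbf{V}_B^{[t-1]})
&= H(\mathbf{P}_B^t) + H(\mathbf{C}_{i,B}^t \mid \mathbf{C}_1 \ldots \mathbf{C}_\alpha \mathbf{X}_{1,(h_1^t - \alpha - 1)}^t \ldots \mathbf{X}_{\beta,(h_\beta^t - \alpha - 1)}^t \mathbf{S}_B^t \mathbf{V}_B^{[t-1]} \mathbf{P}_B^t)\\
&\ge H(\mathbf{P}_B^t) \quad \text{(from (7) of Appendix A).} \qquad (2)
\end{aligned}
\]

Moreover, $\mathbf{A} = \mathbf{C}_{1,B}^t \ldots \mathbf{C}_{\alpha,B}^t$,
$\mathbf{E} = \mathbf{C}_1 \ldots \mathbf{C}_\alpha \mathbf{X}_{1,(h_1^t - \alpha - 1)}^t \ldots \mathbf{X}_{\beta,(h_\beta^t - \alpha - 1)}^t \mathbf{S}_B^t \mathbf{V}_B^{[t-1]}$,
and $\mathbf{F} = \mathbf{C}_{i,B}^t$ verify the hypothesis of Lemma 2 and one has $H(\mathbf{F} \mid \mathbf{E}) =
H(\mathbf{F} \mid \mathbf{A}\mathbf{E})$, that is

\[
\begin{aligned}
H(\mathbf{C}_{i,B}^t \mid \mathbf{C}_1 \ldots \mathbf{C}_\alpha \mathbf{X}_{1,(h_1^t - \alpha - 1)}^t \ldots \mathbf{X}_{\beta,(h_\beta^t - \alpha - 1)}^t \mathbf{S}_B^t \mathbf{V}_B^{[t-1]})
&= H(\mathbf{C}_{i,B}^t \mid \mathbf{C}_{1,B}^t \ldots \mathbf{C}_{\alpha,B}^t \mathbf{C}_1 \ldots \mathbf{C}_\alpha \mathbf{X}_{1,(h_1^t - \alpha - 1)}^t \ldots \mathbf{X}_{\beta,(h_\beta^t - \alpha - 1)}^t \mathbf{S}_B^t \mathbf{V}_B^{[t-1]})\\
&\le H(\mathbf{C}_{i,B}^t \mid \mathbf{C}_{1,B}^t \ldots \mathbf{C}_{\alpha,B}^t \mathbf{X}_{1,(h_1^t - \alpha - 1)}^t \ldots \mathbf{X}_{\beta,(h_\beta^t - \alpha - 1)}^t \mathbf{S}_B^t \mathbf{V}_B^{[t-1]}) && \text{(from (13) of Appendix A)}\\
&= H(\mathbf{C}_{i,B}^t \mid \mathbf{X}_{1,(h_1^t - 1)}^t \ldots \mathbf{X}_{\beta,(h_\beta^t - 1)}^t \mathbf{S}_B^t \mathbf{V}_B^{[t-1]}). && (3)
\end{aligned}
\]

Therefore, the lemma follows from inequalities (3) and (2). □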



The next corollary immediately follows from Lemma 5. It implicitly shows that 
the size of the information each client has to give out when visiting a server is 
lower bounded by the size of the proof the server could reconstruct. 

Corollary 6. Let $M$ be an $(n, m, \tau, c, s, \{h_j^t\}_{j=1,\ldots,m}^{t=1,\ldots,\tau})$ dynamic multi-threshold
metering scheme. For any $i = 1, \ldots, n$, $j = 1, \ldots, m$, and $t = 1, \ldots, \tau$, it holds
that

\[ H(\mathbf{C}_{i,j}^t) \ge H(\mathbf{P}_j^t). \]

If the proofs for the servers are uniformly chosen in a finite field $F$, that is
$H(\mathbf{P}_j^t) = \log |F|$ for any $j = 1, \ldots, m$ and $t = 1, \ldots, \tau$, then from Corollary 6
and from (6) of Appendix A it holds that

\[ \log |C_{i,j}^t| \ge \log |F| \qquad (4) \]

for $i = 1, \ldots, n$, $j = 1, \ldots, m$, and $t = 1, \ldots, \tau$. This bound is tight, as in
Section 3 we have presented a protocol for an $(n, m, \tau, c, s, \{h_j^t\}_{j=1,\ldots,m}^{t=1,\ldots,\tau})$ dynamic
multi-threshold metering scheme in which the clients distribute exactly this
information to servers during their visits.

In order to prove a lower bound on the size of the information distributed to 
clients we need the next lemma. 

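Lemma 7. Let $\mathcal{M}$ be an $(n, m, \tau, c, s, \{h_j^t\}_{j=1,\dots,m}^{t=1,\dots,\tau})$ dynamic
multi-threshold metering scheme. Let $S_1, \dots, S_\beta$ be a coalition of $\beta \le s$
corrupt servers and let $B = \{1, \dots, \beta\}$. Let $Z \subseteq \{1, \dots, n\}$ be a set of
client indices. Then, it holds that

$$H(C_i) \ge \sum_{t=1}^{\tau} H(C_{i,B}^t \mid V_B^{[t-1]}).$$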

Proof. We have that 

T 

. . . c;, |CJ < ^ H{Cl^ |C,) (from (15) and (13) of Appendix A) 

t=i 

= 0 (from Lemma 4) . 

Therefore, applying Lemma 3 with F = ^ . . . and D = we get 

H(C,) = H{cl^ . . . c; j -h i7(c, |C] , . . . c; j 

> H{Cl g . . . CJ^) (from (7) of Appendix A) 

T 

= J HK , 3 . . . C‘-i) (from (14) of Appendix A) 

t^2 

>E^(Ct,|V^il). 

t=l 

The next lemma provides a lower bound on the size of the information dis- 
tributed to clients during the initialization phase in dynamic multi-threshold 
metering schemes. It states that the information that must be kept secret by 
clients grows linearly with the number of time frames and the size of the coali- 
tion of corrupt servers. 

Lemma 8. Let $\mathcal{M}$ be an $(n, m, \tau, c, s, \{h_j^t\}_{j=1,\dots,m}^{t=1,\dots,\tau})$ dynamic
multi-threshold metering scheme. Let $S_1, \dots, S_\beta$ be a coalition of $\beta \le s$
corrupt servers and let $B = \{1, \dots, \beta\}$. For any $i = 1, \dots, n$, it holds that

$$H(C_i) \ge \sum_{t=1}^{\tau} H(P_B^t).$$

Proof. Let Ci be a client and for j = !,...,/? and t = 1, . . . , r, let A* be 

a set of visits from /i* — 1 clients other than Ci to server Sj in time frame t. We 
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have that 

T 

H{Ci) > (from Lemma 7) 

t = l 

T 

>T.HKb\K(K-i) • ■ (from (13) of Appendix A) 

T 

> iL(P^) (from Lemma 5). 

t=i □ 

Notice that in Definition 1 we did not say anything about the entropies of the
random variables $P_j^t$ for $j \in \{1, \dots, m\}$ and $t \in \{1, \dots, \tau\}$. Our results apply
to the general case of arbitrary entropies on proofs, but for clarity, we state the
next corollary for the simpler case that $H(P_{j_1}^{t_1}) = H(P_{j_2}^{t_2})$ for all
$j_1, j_2 \in \{1, \dots, m\}$ and $t_1, t_2 \in \{1, \dots, \tau\}$. We denote this common entropy by $H(P)$.

If the proof sequences of the corrupt servers are statistically independent, then
the next corollary holds.

Corollary 9. Let $\mathcal{M}$ be an $(n, m, \tau, c, s, \{h_j^t\}_{j=1,\dots,m}^{t=1,\dots,\tau})$ dynamic
multi-threshold metering scheme and let $S_1, \dots, S_s$ be the $s$ corrupt servers. If the
proof sequences of the $s$ corrupt servers are statistically independent, then it
holds that

$$H(C_i) \ge s\tau H(P),$$

for any $i = 1, \dots, n$.

If the proofs for the servers are uniformly chosen in a finite field $F$, that is,
$H(P) = \log |F|$, then from Corollary 9 and from (6) of Appendix A it holds that

$$\log |C_i| \ge s\tau \log |F|, \qquad (5)$$

for any $i = 1, \dots, n$. This bound is tight, as in Section 3 we have presented
a protocol for an $(n, m, \tau, c, s, \{h_j^t\}_{j=1,\dots,m}^{t=1,\dots,\tau})$ dynamic multi-threshold
metering scheme which distributes exactly this amount of information to clients.

5 Efficiency of the Scheme 

In this section we analyze the efficiency of the scheme presented in Section 3.
It is easy to see that the scheme meets the bounds (4) and (5) of Section 4.
Indeed, during the initialization phase each client $C_i$ receives from the audit
agency the polynomial $Q(i, y)$, which is of degree $s\tau - 1$. Therefore, the size
of the information distributed to any client is $s\tau \log q$ and the bound (5) is
tight. During a regular operation in a time frame $t$ each client $C_i$ gives the
value $Q(i, j \circ t)$ to the visited server $S_j$. Therefore, the size of the information
distributed to any visited server is $\log q$ and the bound (4) is tight. Hence, our
protocol is optimal both with respect to the size of the information distributed
to clients and with respect to the size of the information given to servers by
clients. This is important because otherwise the task of receiving and sending
information would burden the clients, who are not interested in the metering
process.
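The two sizes discussed in this section can be illustrated with a small sketch. It assumes that a client's share is stored as the $s\tau$ coefficients of a univariate polynomial over a prime field of size $q$, and that a visit consists of one evaluation of that polynomial; the concrete field size, the coefficient values, and the integer used to encode the pair $(j, t)$ are hypothetical choices made only for illustration.

```python
import math

def eval_share(coeffs, y, q):
    """Evaluate a univariate polynomial over GF(q) at y with Horner's rule.

    coeffs[d] is the coefficient of y**d; q is assumed to be prime.
    """
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * y + c) % q
    return acc

def client_share_bits(s, tau, q):
    """Bits needed to store the s*tau coefficients of a degree s*tau - 1 polynomial."""
    return s * tau * math.log2(q)

def visit_bits(q):
    """Bits a client hands to a server in one visit: a single field element."""
    return math.log2(q)

# Illustrative parameters (assumptions, not taken from the paper).
q = 2**61 - 1                        # a Mersenne prime used as the field size
s, tau = 3, 4                        # corrupt servers, time frames
share = list(range(1, s * tau + 1))  # stand-in coefficients of Q(i, y)
point = 7                            # hypothetical field encoding of the pair (j, t)
visit_value = eval_share(share, point, q)
```

Under these assumptions the client stores $s\tau$ field elements and sends exactly one per visit, which matches the sizes $s\tau \log q$ and $\log q$ stated above.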



Dynamic Multi-threshold Metering Schemes 141 

6 Conclusions and Open Problems 

In this paper we have introduced dynamic multi-threshold metering schemes. In 
these schemes the servers need to communicate with the audit agency at the 
beginning of any time frame. 

In this paper we have assumed that clients provide correct values when they 
visit servers. In a practical implementation of a metering scheme, some method 
of authentication should be used. However, the method of authentication used 
would be, in general, not dependent on the specific metering scheme and it could 
be incorporated as an additional feature, if desired. 

We have proved lower bounds on the size of the information distributed to 
clients and on the size of the information given from clients to servers during 
their visits. An interesting problem would be to provide lower bounds on the size 
of the information distributed to servers at the beginning of any time frame and 
to devise dynamic multi-threshold metering schemes in which this information 
is as small as possible. 
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A Information Theory Background 

In this Appendix we review the basic concepts of Information Theory used in 
our definitions and proofs. For a complete treatment of the subject the reader 
is advised to consult [2]. 

Given a probability distribution $\{Pr_X(x)\}_{x \in X}$ on a set $X$, we define the
entropy of $X$, $H(X)$, as

$$H(X) = -\sum_{x \in X} Pr_X(x) \log Pr_X(x).$$

All logarithms in this paper are to the base 2.

The entropy satisfies the following property

$$0 \le H(X) \le \log |X|, \qquad (6)$$

where $H(X) = 0$ if and only if there exists $x_0 \in X$ such that $Pr_X(x_0) = 1$;
whereas, $H(X) = \log |X|$ if and only if $Pr_X(x) = 1/|X|$ for all $x \in X$.

Given two sets $X$ and $Y$ and a joint probability distribution on their cartesian
product, the conditional entropy $H(X \mid Y)$ is defined as

$$H(X \mid Y) = -\sum_{y \in Y} \sum_{x \in X} Pr_Y(y) Pr(x \mid y) \log Pr(x \mid y).$$

From the definition of conditional entropy it is easy to see that

$$H(X \mid Y) \ge 0. \qquad (7)$$

The mutual information $I(X; Y)$ between $X$ and $Y$ is defined by

$$I(X; Y) = H(X) - H(X \mid Y) \qquad (8)$$

and enjoys the following properties:

$$I(X; Y) = I(Y; X) \qquad (9)$$

and $I(X; Y) \ge 0$, from which one gets

$$H(X) \ge H(X \mid Y). \qquad (10)$$



Given three sets $X$, $Y$, $Z$ and a joint probability distribution on their cartesian
product, the conditional mutual information $I(X; Y \mid Z)$ between $X$ and $Y$ given
$Z$ is

$$I(X; Y \mid Z) = H(X \mid Z) - H(X \mid ZY) \qquad (11)$$

and enjoys the following properties:

$$I(X; Y \mid Z) = I(Y; X \mid Z) \qquad (12)$$

and $I(X; Y \mid Z) \ge 0$. Since the conditional mutual information is always
non-negative we get

$$H(X \mid Z) \ge H(X \mid ZY). \qquad (13)$$

Given $n + 1$ sets $X_1, \dots, X_n, Y$ and a joint probability distribution on their
cartesian product, the entropy of $X_1 \dots X_n$ given $Y$ can be expressed as

$$H(X_1 \dots X_n \mid Y) = H(X_1 \mid Y) + \sum_{i=2}^{n} H(X_i \mid X_1 \dots X_{i-1} Y) \qquad (14)$$

and enjoys the following property:

$$H(X_1 X_2 \dots X_n \mid Y) \le \sum_{i=1}^{n} H(X_i \mid Y). \qquad (15)$$
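The definitions above can be checked numerically on a small joint distribution. The following is a minimal sketch; the joint distribution and the helper names are assumptions chosen for illustration, and the code only exercises properties (6), (7), (9), and (10).

```python
import math
from collections import defaultdict

def entropy(dist):
    """H(X) in bits for a distribution given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def marginal(joint, axis):
    """Marginal of a joint distribution {(x, y): p} on coordinate `axis`."""
    out = defaultdict(float)
    for pair, p in joint.items():
        out[pair[axis]] += p
    return dict(out)

def conditional_entropy(joint):
    """H(X | Y) for a joint distribution {(x, y): p}."""
    py = marginal(joint, 1)
    return -sum(p * math.log2(p / py[y]) for (x, y), p in joint.items() if p > 0)

def swap(joint):
    """Exchange the roles of X and Y."""
    return {(y, x): p for (x, y), p in joint.items()}

# Illustrative joint distribution on {0, 1} x {0, 1} (an assumption).
joint = {(0, 0): 0.5, (0, 1): 0.25, (1, 1): 0.25}

h_x = entropy(marginal(joint, 0))
h_y = entropy(marginal(joint, 1))
h_x_given_y = conditional_entropy(joint)
h_y_given_x = conditional_entropy(swap(joint))
i_xy = h_x - h_x_given_y   # I(X; Y) as in (8)
i_yx = h_y - h_y_given_x   # I(Y; X)
```

For this distribution the conditional entropy $H(X \mid Y)$ is half a bit, while $H(X)$ is about 0.81 bits, so conditioning reduces the entropy as (10) requires.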
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B Parameters and Variables Used in the Paper 



$n$                       number of clients
$m$                       number of servers
$\tau$                    number of time frames
$c$                       number of corrupt clients
$s$                       number of corrupt servers
$h_j^t$                   threshold for server $S_j$ in time frame $t$
$C_i$                     information distributed to client $C_i$
$C_{i,j}^t$               visit from client $C_i$ to server $S_j$ in time frame $t$
$B = \{1, \dots, \beta\}$ indices of corrupt servers, $\beta \le s$
$C_{i,B}^t$               visits from client $C_i$ to servers $S_1, \dots, S_\beta$ in time frame $t$
                          visits from $d_j$ clients to server $S_j$ in time frame $t$
$S_j^t$                   information distributed to server $S_j$ at the beginning
                          of time frame $t$
$S_B^t$                   information distributed to servers $S_1, \dots, S_\beta$ at the beginning
                          of time frame $t$
$P_j^t$                   proof for server $S_j$ in time frame $t$
$P_B^t$                   proofs for servers $S_1, \dots, S_\beta$ in time frame $t$
$V_j^{[t]}$               information collected by server $S_j$ in time frames $1, \dots, t$
$V_B^{[t]}$               information collected by servers $S_1, \dots, S_\beta$
                          in time frames $1, \dots, t$
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Abstract. We present a protocol for the exchange of individually au- 
thenticated data streams among N parties. Our authentication procedure 
is fast, because it only requires the computation of hash functions; we
do not need digital signatures, which are substantially less efficient. The
authentication information is also short: two hash values for every block
of data. Since there are no shared secrets, this information does not grow 
with N, the number of parties. 



1 Introduction 

Multicast applications are receiving increasing commercial interest. In particu- 
lar, multicast conferencing tools are available that support real-time multiway 
communications over a packet network. 

The security of multicast and videoconferencing has also become an impor- 
tant and specific issue [8,3,21], and it includes two forms of authentication ser- 
vices: 

— Group authentication. The conference participants share one key, and
authentication makes it possible to check that data has not been modified or
inserted by attackers outside the group [21,24,3].

— Individual authentication. Received data streams must be authenticated with 
respect to their individual origin, i.e. one individual group member [8]. 

This paper addresses individual authentication. We obtain an authentication
protocol that is secure even when attackers may adaptively choose authenticated
data streams. For an interactive scenario, which is typical of multicast
conferencing, the proposed solution is substantially more efficient than
previous approaches.

* The protocol described in this paper, and the timing protocol derived from it, were 
first presented in a seminar at IBM T. J. Watson Research Center in summer 1998 
by F. Bergadano 

** This work was completed while Bruno Crispo was with the Computer Science De- 
partment, University of Turin 
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2 Previous Work 

2.1 Individual Authentication in Multicast Groups 

Individual authentication may be essential in multicast applications. Yet, it is 
difficult to achieve. If one uses digital signatures, the required computational cost 
may be too high. In fact, end-user hardware may be limited and loaded by the 
multimedia processing that is part of the conferencing application. If one uses 
symmetric Message Authentication Codes (MACs) , every stream block will need 
to carry a distinguished code for every recipient. Previous work on individual 
authentication for multicast is all based on one of the above schemes (signature 
or multiple MACs), with modifications directed to improve efficiency. 

On the side of symmetric authentication, it is possible to use a fixed number 
K of MAC keys, that does not depend on the size N of the multicast group [8,10] . 
The sender knows K keys, and each receiver knows K/2 keys, chosen at random. 
Each stream block is authenticated with K MACs, corresponding to the K keys, 
and receivers check the MACs for which they have a key. Obviously, receiver 
collusions can lead to forged individual authentication codes. 
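The fixed-key approach and its weakness against colluding receivers can be sketched as follows. This is only an illustration under stated assumptions: HMAC-SHA-256 is used as the MAC, $K = 8$, and the key subsets of the receivers are fixed by hand rather than drawn at random; none of these choices come from [8,10].

```python
import hashlib
import hmac
import os

K = 8  # fixed number of MAC keys, independent of the group size N

def mac(key, block):
    return hmac.new(key, block, hashlib.sha256).digest()

sender_keys = [os.urandom(16) for _ in range(K)]

def authenticate(block):
    """Sender: attach one MAC per key, K MACs in total."""
    return [mac(key, block) for key in sender_keys]

def verify(receiver_keys, block, tags):
    """Receiver: check only the MACs whose keys it holds ({index: key})."""
    return all(hmac.compare_digest(mac(key, block), tags[i])
               for i, key in receiver_keys.items())

def subset(indices):
    return {i: sender_keys[i] for i in indices}

victim = subset([0, 2, 5, 7])      # K/2 keys (chosen by hand for this example)
colluder_1 = subset([0, 1, 2, 3])
colluder_2 = subset([4, 5, 6, 7])

block = b"stream block 1"
tags = authenticate(block)

# Collusion: the pooled keys cover every key the victim holds,
# so the colluders can produce MACs the victim will accept.
pooled = {**colluder_1, **colluder_2}
fake = b"forged block"
forged_tags = [mac(pooled[i], fake) if i in pooled else bytes(32)
               for i in range(K)]
```

The genuine block verifies, a tampered block does not, and the forged block built from the pooled keys is accepted by the victim, which is the collusion problem noted above.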

More work is available on the side of efficient digital signatures. Online/offline
signatures [11] may be used to split the computation into a more expensive
offline phase, followed by an efficient online phase that is needed when the data
becomes available. On the other hand, part of the signature can be performed
by a “signature server”, that may be located elsewhere, without compromising 
authentication or even non-repudiation [2]. One-time signatures [18,7] are an 
efficient alternative, and are adequate for stream authentication. Gennaro and 
Rohatgi [12] have proposed an efficient stream signing method that is based on 
a chain of one-time signatures. The disadvantage of this approach lies in the 
length of one-time signatures. 

2.2 Hash Chains 

Our work does not fall within the two above categories. In particular, it is not 
based on asymmetric cryptography, and there are no shared secrets. We use 
single individual MACs, and a hash chain of individual, secret keys. Hash chains 
have been used for a long time and for a number of different purposes. 

Hash chains were first proposed by Lamport [17] and then used in the S/Key
[15] user identification system. They have been applied to the authentication of
public key certificate revocation/validity messages [20], to digital payment
systems [23], and to Web server hit acknowledgements. All of the above applications
basically use hash chains to send periodic “yes, I’m alive” messages in a secure
way. In other words, they are contentless, meaning that content is bound to the
chain at the start, but is not added or modified when the individual hash values
are released.
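A contentless hash chain of this kind can be sketched in a few lines. The sketch assumes SHA-256 as the hash function and uses hypothetical names (`make_chain`, `Verifier`); it illustrates the general idea rather than the S/Key implementation.

```python
import hashlib

def h(x):
    return hashlib.sha256(x).digest()

def make_chain(seed, k):
    """Return [h^0(seed), h^1(seed), ..., h^k(seed)]."""
    chain = [seed]
    for _ in range(k):
        chain.append(h(chain[-1]))
    return chain

class Verifier:
    """Holds the most recently accepted value, starting from the anchor h^k(seed)."""

    def __init__(self, anchor):
        self.last = anchor

    def accept(self, value):
        # A released value is valid if hashing it gives the previous one.
        if h(value) == self.last:
            self.last = value
            return True
        return False

chain = make_chain(b"illustrative seed", 5)
verifier = Verifier(chain[5])      # anchor assumed to be distributed authentically
first_ok = verifier.accept(chain[4])
second_ok = verifier.accept(chain[3])
replay_ok = verifier.accept(chain[3])   # replaying a value is rejected
skip_ok = verifier.accept(chain[1])     # skipping ahead is rejected
```

Each released value proves liveness only; no message content is attached to it, which is the limitation the text points out.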

This is not the case with more recent approaches, where hash chains have 
been proposed for the authentication of messages, as described in [2,1]. The basic 
idea of [1] is to MAC each block of data with a new key, and send the key as part 



146 



Francesco Bergadano, Davide Cavagnino, and Bruno Crispo 



of the next message. The main problem with that technique is that when a key 
is sent too early (i.e. before the previous message was obtained by all intended 
recipients), falsifying the rest of the stream becomes possible. 

A time protocol based on hash chains was developed independently by [9] 
and by [5]. This protocol was also published in [22] and [6]. 

In this paper we present an interactive protocol that does not depend on 
time, and provide proofs of its security. 

3 Chained Stream Authentication 

Following the formalization of [13] and [12], we define a security parameter, $n$,
and say that a function $\epsilon(n)$ is “negligible” if, for all constants $c$, there is $n_0$
such that, for $n > n_0$, $\epsilon(n) < 1/n^c$.

Following [13], we define a signature scheme as a triple (G,Sig,V) of proba- 
bilistic polynomial time algorithms, where (1) G is used to generate a key pair 
(SK,PK), (2) Sig is used to sign any message M, using the secret key SK, and (3) 
V is the signature verification algorithm, such that V(PK,M,Sig(SK,M))=1, for
any message M. We will use a signature scheme that is secure against adaptively 
chosen message attacks [13]: the probability of forging a signature is negligible, 
even when a signature oracle is available. 

Similarly, we define a stream authentication scheme as a triple of probabilistic 
polynomial-time algorithms (GA,AA,VA), where 

— On input $1^n$, GA outputs a pair of keys (SK,PK) $\in \{0, 1\}^{2n}$.

— AA is the authentication algorithm, and receives as input a secret key SK
and a stream $S = S_1, S_2, \dots, S_l$, consisting of a finite number $l$ of blocks.
AA outputs an authenticated stream $S' = S'_1, \dots, S'_l$, where $S'_j = (S_j, auth_j)$,
$auth_j$ being some kind of authentication data.

— The verification algorithm VA is such that VA(PK,AA(SK,S))=1. When
VA(PK,S')=1, we will say that S' is valid.
We may now define our proposed scheme, called a chained stream authenti- 
cation scheme: 

— As a generator GA, we use the generator of a signature scheme (G,Sig,V),
secure against chosen message attacks.

— The authentication algorithm AA will be called a “Chained Stream
Authentication” algorithm (CSA). This algorithm first generates a secret $\alpha$,
computes $h^k(\alpha)$ for some $k > l$, where $l$ is the number of blocks in the
stream, and then produces the following output:

$S'_1 = S_1, MAC_{h^{k-1}(\alpha)}(S_1), h^k(\alpha), SN, Sig(SK, h^k(\alpha), SN)$

$S'_2 = S_2, MAC_{h^{k-2}(\alpha)}(S_2), h^{k-1}(\alpha)$

...

$S'_l = S_l, MAC_{h^{k-l}(\alpha)}(S_l), h^{k-l+1}(\alpha)$

By MAC, we denote a secure Message Authentication Code, i.e. such that
the probability of forging a valid code is negligible, even when a MAC
oracle is available. For h, we will use a collision resistant hash function. The
authentication includes a session number SN that is incremented for every
new stream.

— The verification algorithm VA will output 1 if the initial asymmetric signature
is valid, if all the MACs are correct, and if the hash chain of the MAC keys
is consistent, i.e. the hash of a key produces the previous key in the chain.
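The scheme described above can be sketched as follows. This is a minimal illustration under several assumptions: SHA-256 plays the role of h, HMAC-SHA-256 plays the role of the MAC, $k = l + 1$, and the asymmetric signature is replaced by a keyed placeholder (`sign`, `verify_sig`) that is not a real signature scheme. The verifier in this sketch checks each MAC once the next block has released its key, so the MAC of the last block is left unchecked.

```python
import hashlib
import hmac
import os

def h(x):
    return hashlib.sha256(x).digest()

def mac(key, block):
    return hmac.new(key, block, hashlib.sha256).digest()

def hash_chain(secret, k):
    """Return c with c[m] = h^m(secret) for m = 0..k."""
    c = [secret]
    for _ in range(k):
        c.append(h(c[-1]))
    return c

def csa_authenticate(blocks, sn, sign):
    """Authenticate a stream; `sign` stands in for Sig(SK, .)."""
    l = len(blocks)
    k = l + 1                                   # any k > l would do
    c = hash_chain(os.urandom(32), k)
    stream = []
    for j, block in enumerate(blocks, start=1):
        # Block j is MACed under h^{k-j} and carries h^{k-j+1}.
        item = {"block": block, "mac": mac(c[k - j], block), "hash": c[k - j + 1]}
        if j == 1:                              # only the first block is signed
            item["sn"] = sn
            item["sig"] = sign(c[k] + sn.to_bytes(8, "big"))
        stream.append(item)
    return stream

def csa_verify(stream, verify_sig):
    first = stream[0]
    if not verify_sig(first["hash"] + first["sn"].to_bytes(8, "big"), first["sig"]):
        return False
    for prev, cur in zip(stream, stream[1:]):
        key = cur["hash"]                       # MAC key of the previous block
        if h(key) != prev["hash"]:              # hash chain consistency
            return False
        if not hmac.compare_digest(mac(key, prev["block"]), prev["mac"]):
            return False
    return True

# Placeholder "signature": an HMAC under a fixed key. NOT a real signature scheme.
_SIG_KEY = os.urandom(32)

def sign(msg):
    return mac(_SIG_KEY, msg)

def verify_sig(msg, sig):
    return hmac.compare_digest(sign(msg), sig)

stream = csa_authenticate([b"block 1", b"block 2", b"block 3"], sn=1, sign=sign)
```

In this sketch a change to any block except the last is detected, because the key that verifies block $j$ only arrives with block $j + 1$.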

4 Security against Continuations 

What are the security properties of the proposed scheme? Clearly, given a stream
authentication oracle, it is possible to forge new valid authenticated streams,
because the MAC keys become known. Therefore, CSA is not “secure” according to
the definition of [12], which can be rephrased as follows: “a stream authentication
scheme (GA,AA,VA) is secure if any probabilistic polynomial-time algorithm F,
given as input the public key PK and adaptively chosen authenticated streams
$S'^{(j)}$, outputs a new valid authenticated stream $S' \neq S'^{(j)}$ for all $j$, only with
negligible probability”. Clearly, this definition of security does not apply to the
CSA scheme: the forger F may ask for just one authenticated stream, change any
block but the last, and recompute the corresponding MAC using the available
key.
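The forgery described above can be made concrete with a short sketch. It assumes SHA-256 for h and HMAC-SHA-256 for the MAC, and it omits the signature on the first block, since the forger leaves the signed value $h^k$ untouched; the secret, the block contents, and the helper names are hypothetical.

```python
import hashlib
import hmac

def h(x):
    return hashlib.sha256(x).digest()

def mac(key, block):
    return hmac.new(key, block, hashlib.sha256).digest()

# keys[m] = h^m(secret); block j is MACed under keys[k-j] and carries keys[k-j+1].
secret, k = b"illustrative secret", 4
keys = [secret]
for _ in range(k):
    keys.append(h(keys[-1]))

blocks = [b"pay 10", b"to Bob", b"today"]
stream = [(b, mac(keys[k - j], b), keys[k - j + 1])
          for j, b in enumerate(blocks, start=1)]

def macs_and_chain_ok(s):
    """Check every MAC whose key has been released, and the hash chain."""
    for (b, tag, released), (_, _, nxt) in zip(s, s[1:]):
        if h(nxt) != released or not hmac.compare_digest(mac(nxt, b), tag):
            return False
    return True

# The forger only observes `stream`. The second block reveals the MAC key
# of the first block, so the first block can be replaced and re-MACed.
leaked_key = stream[1][2]
forged_block = b"pay 99"
forged = [(forged_block, mac(leaked_key, forged_block), stream[0][2])] + stream[1:]
```

The forged stream passes the same checks as the original, and the signed first hash value is unchanged, so the initial signature would still verify.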

However, our scheme satisfies a weaker security notion, which we will call
“security against continuations”. We define continuations as follows:

Definition. A stream $S_2$ is a continuation of a stream $S_1$, denoted by $S_1 \subset S_2$,
if $S_1$ is a proper prefix of $S_2$.

The same definition applies to authenticated streams under (GA,AA,VA). A
valid authenticated continuation $S'_2$ of an authenticated stream $S'_1$ must then be
such that $S'_1 \subset S'_2$ and $VA(PK, S'_2) = 1$. Security against continuations means,
informally, that it is infeasible to produce valid continuations of observed valid
streams. This weaker notion will nevertheless be sufficient for building secure
authentication protocols, after some means of sender/receiver synchronization
is achieved, as described in Sections 5 and 6. More precisely, security against
continuations corresponds to the following:

Intuition. A forger may produce a new valid stream S only if it is associated to
a stream T that was used previously. Moreover, if |S| = |T|, and the last blocks
of S and T are different, then it will be impossible to produce a valid continuation
of S.

Next, we formalize the above intuition and prove that it applies to CSA (in 
Lemma 1, with proof given in Appendix A). 

Definition. A stream authentication scheme is secure against continuations
if there is no polynomial time algorithm F that, given adaptively chosen
authenticated streams $S'^{(1)}, \dots, S'^{(k)}$, is able to generate a valid authenticated
stream $S' = S'_1, \dots, S'_j$ with non-negligible probability, unless $\exists i \in [1, k]$ such that
$S'^{(i)}_1 = S^{(i)}_1, MAC^{(i)}_1, H, SN, sig(H, SN)$, where $S'_1 = S_1, MAC_1, H, SN, sig(H, SN)$,
and one of the following holds:

1. $j < |S'^{(i)}|$, or

2. $j = |S'^{(i)}|$, and $S'_j = S'^{(i)}_j$, or

3. $j = |S'^{(i)}|$, and $S'_j \neq S'^{(i)}_j$, and there is no polynomial time algorithm F’ that
can generate with non-negligible probability a valid continuation $T' \supset S'$,
given $S'^{(1)}, \dots, S'^{(k)}$, possibly other adaptively chosen authenticated streams,
and any valid authenticated continuation of $S'^{(i)}$.

Security against continuations is a complicated notion, but it will lead us to 
a simple concept of stream authentication in Theorem 1. First, though, we need 
the following: 

Lemma 1. Suppose that, in the CSA authentication scheme,

(1) (G,S,V) is a secure signature scheme,

(2) g is a pseudorandom function,

(3) $h(x) = g_x(0)$ and $MAC_k(x) = g_k(1, x)$,

(4) g is such that h is a collision resistant hash function.

Then, the scheme (GA, CSA, VA) is secure against continuations.

MAC and h are those of the CSA algorithm, and are defined through a
pseudorandom function [14,4] as in (3) above, because not only should the MAC
be secure, but each key $k$ must look random even though $h(k)$ is known. With
the definition of (3), knowing $h(k) = g_k(0)$ gives no additional information, as
one could in any case query the oracle for $g_k$ and obtain $g_k(0)$. In practice, one
could use g=HMAC [16] so as to satisfy both (2) and (4).
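Condition (3) with g=HMAC can be sketched as below. The choice of SHA-256 as the underlying hash and the byte encodings of the inputs 0 and (1, x) are assumptions made for illustration; the text only states that g could be HMAC.

```python
import hashlib
import hmac

def g(key, data):
    """The keyed function g_key(data), instantiated here with HMAC-SHA-256."""
    return hmac.new(key, data, hashlib.sha256).digest()

def h(x):
    """h(x) = g_x(0): the input 0 is encoded as the single byte 0x00 (assumption)."""
    return g(x, b"\x00")

def mac(k, x):
    """MAC_k(x) = g_k(1, x): the pair (1, x) is encoded as 0x01 || x (assumption)."""
    return g(k, b"\x01" + x)

secret = b"illustrative secret"
k1 = h(secret)      # h^1(secret)
k2 = h(k1)          # h^2(secret)
tag = mac(k1, b"some block")
```

Because the two encodings always differ in their first byte, the value $h(k)$ released on the chain is never equal to a MAC input evaluated under the same key.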

5 The Chained Stream Authentication Protocol 
with One Sender and One Receiver 

We will now use the CSA scheme to authenticate information over an inse- 
cure network. In fact, CSA’s security against continuations can be used with a 
synchronization mechanism to obtain a very efficient individual authentication 
method. For now, we consider one party, named A, who will send authenticated 
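data, and one party, named B, who will receive the data. The protocol is
defined below, where $Sig_A(x) = Sig(SK_A, x)$ is A’s signature under (G,S,V), and
similarly $Sig_B(x) = Sig(SK_B, x)$ for B:

1. B → A: $h^k(\beta), SN, Sig_B(h^k(\beta), SN)$

A → B: $A_1, MAC_{h^{k-1}(\alpha)}(A_1), h^k(\alpha), SN, Sig_A(h^k(\alpha), SN)$

2. B → A: $h^{k-1}(\beta)$

A → B: $A_2, MAC_{h^{k-2}(\alpha)}(A_2), h^{k-1}(\alpha)$

...

i. B → A: $h^{k-i+1}(\beta)$

A → B: $A_i, MAC_{h^{k-i}(\alpha)}(A_i), h^{k-i+1}(\alpha)$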

Messages are sequential: A will not send message $i$ if it has not received a
correct $i$-th message from B, and B will not send message $i + 1$ if it has not
received from A a correct $i$-th message. A and B initially generate individual
random secrets $\alpha$ and $\beta$, and compute $h^k(\alpha)$ and $h^k(\beta)$, respectively. These
values, and the session number SN, are signed, and exchanged as part of the
first messages in step 1. Then, A sends data as defined in the CSA scheme, and
B sends back authenticated acknowledgments. B’s authenticated ack for A’s $j$-th
message is simply $h^{k-j}(\beta)$. The receiver side is then similar to what happens in
S/Key and similar applications [15,17]. The security properties of the protocol,
to be discussed next, are made relative to the following:

Active Attack Model. CSA sender A, CSA receiver B, and attacker E:

— E runs in polynomial time and may ask A to send B the authenticated streams
$S^{(1)}, \dots, S^{(k)}$ in sessions $1, \dots, k$;

— A chooses stream $S^{(k+1)}$ and sends it to B in session $k + 1$;

— At any time, E can read messages, stop messages and insert messages;

— During session $k + 1$, E tries to have B receive $S' \neq S^{(k+1)}$ and believe it
authentic.

We call the above an active stream authentication attack. We shall prove
in Theorem 1 that such attacks are not feasible with the above CSA protocol,
except for the possible falsification of the last block of $S^{(k+1)}$. We first note that
session numberings by sender and receiver are consistent:

Observation 1. Suppose A has sent its first message of session SN . Then B 
must have already sent its first message of session SN. 

Observation 2. Suppose B has sent its second message of session SN. Then A 
must have already sent its first message of session SN. 

The observations allow us to speak of a “current session” in the CSA protocol 
with one sender and one receiver. We can now prove that this protocol represents 
a valid authentication mechanism: 

Theorem 1 (Security of CSA with one sender and one receiver). Suppose
sender A and receiver B run SN sessions under the CSA protocol, where:

— the conditions of Lemma 1 hold;

— a polynomial active attacker E chooses streams $S^{(1)}, \dots, S^{(SN-1)}$, that A
sends to B in sessions $1, \dots, SN - 1$;

— B has received the valid authenticated stream $S' = S'_1, \dots, S'_j$ during session
SN.

Then, E can cause $S'_1, \dots, S'_{j-1}$ to be non-authentic only with negligible
probability.

The proof is by induction on $|S'|$, based on Lemma 1 (see Appendix A).
The last block of S' may be modified by E. Thus, authentication is obtained
with a one block delay: the receiver can ascertain the origin and the integrity of
a block in the stream only after receiving the next block. This is acceptable in
most multicast applications. This is achieved without shared secrets and without
signatures after the first message. The important consequences of this fact are
discussed in the next section, where the protocol is used with more than just
two parties.
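The message flow of the one-sender, one-receiver protocol, and the resulting one block delay, can be seen in a small simulation. It is a sketch under several assumptions: SHA-256 and HMAC-SHA-256 stand in for h and the MAC, the signatures on the first messages are omitted (the first hash values are simply trusted), and the class and method names are hypothetical.

```python
import hashlib
import hmac
import os

def h(x):
    return hashlib.sha256(x).digest()

def mac(key, block):
    return hmac.new(key, block, hashlib.sha256).digest()

def chain(secret, k):
    c = [secret]
    for _ in range(k):
        c.append(h(c[-1]))
    return c                          # c[m] = h^m(secret)

class Sender:
    def __init__(self, k, blocks):
        self.k, self.blocks, self.i = k, blocks, 0
        self.keys = chain(os.urandom(32), k)
        self.last_ack = None

    def send(self, ack):
        # Every ack after the first must hash to the previous ack.
        if self.last_ack is not None and h(ack) != self.last_ack:
            raise ValueError("bad acknowledgment")
        self.last_ack = ack
        self.i += 1
        block = self.blocks[self.i - 1]
        return block, mac(self.keys[self.k - self.i], block), self.keys[self.k - self.i + 1]

class Receiver:
    def __init__(self, k):
        self.k, self.sent = k, 0
        self.acks = chain(os.urandom(32), k)
        self.pending = None           # (block, tag) still waiting for its key
        self.last_hash = None
        self.authentic = []

    def next_ack(self):
        self.sent += 1                # the i-th message carries h^{k-i+1}(beta)
        return self.acks[self.k - self.sent + 1]

    def receive(self, block, tag, released):
        if self.last_hash is not None:
            prev_block, prev_tag = self.pending
            if h(released) != self.last_hash:
                raise ValueError("hash chain broken")
            if not hmac.compare_digest(mac(released, prev_block), prev_tag):
                raise ValueError("bad MAC")
            self.authentic.append(prev_block)
        self.pending, self.last_hash = (block, tag), released

blocks = [b"b1", b"b2", b"b3"]
A, B = Sender(4, blocks), Receiver(4)
for _ in blocks:
    B.receive(*A.send(B.next_ack()))
```

After three exchanges only the first two blocks are in `B.authentic`; the third stays pending until a further message releases its key.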
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6 The N-Party CSA Protocol 

We will now extend the protocol so that it can be used effectively in a multicast 
conferencing scenario. As a first step, we will consider two parties, A and B, 
that must exchange authenticated data in both directions. Obviously, this can 
be done by simply applying the CSA protocol with sender A and receiver B, and 
simultaneously with sender B and receiver A. However, we can make this more 
efficient by merging the hash values sent as acknowledgments and the ones sent 
as hashes of secrets. This results in the following two-party protocol: 

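1. B → A: $B_1, MAC_{h^{k-1}(\beta)}(B_1), h^k(\beta), SN, Sig_B(h^k(\beta), SN)$

A → B: $A_1, MAC_{h^{k-1}(\alpha)}(A_1), h^k(\alpha), SN, Sig_A(h^k(\alpha), SN)$

2. B → A: $B_2, MAC_{h^{k-2}(\beta)}(B_2), h^{k-1}(\beta)$

A → B: $A_2, MAC_{h^{k-2}(\alpha)}(A_2), h^{k-1}(\alpha)$

...

i. B → A: $B_i, MAC_{h^{k-i}(\beta)}(B_i), h^{k-i+1}(\beta)$

A → B: $A_i, MAC_{h^{k-i}(\alpha)}(A_i), h^{k-i+1}(\alpha)$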

The above protocol may cause practical transmission difficulties. In particular, 
each party may send a block of data only after receiving a corresponding block 
from the other party. This causes a kind of stop-and-wait behaviour that implies 
poor network utilization and may result in unacceptable delays for real-time 
traffic. Fortunately, such strict sequentialization of messages is not necessary. In 
particular, data blocks and MACs can be sent at any time - only the delivery 
of keys need be delayed until acknowledgements are obtained and verified. We 
may therefore rewrite the above two party protocol by splitting the behaviour 
of each party into a data sender process and a key sender process, that run 
independently. For party A, the processes are defined as follows (party B is 
defined symmetrically) : 

A’s data sender process:

1. send to B: $MAC_{h^{k-1}(\alpha)}(A_1), A_1$

2. send to B: $MAC_{h^{k-2}(\alpha)}(A_2), A_2$

...

A’s key sender process:

1. wait for $MAC_{h^{k-1}(\beta)}(B_1)$

send to B: $h^k(\alpha), SN, sig_A(h^k(\alpha), SN)$

2. wait for $MAC_{h^{k-2}(\beta)}(B_2)$

wait for $h^k(\beta), SN, sig_B(h^k(\beta), SN)$

send to B: $h^{k-1}(\alpha)$

3. wait for $MAC_{h^{k-3}(\beta)}(B_3)$

wait for $h^{k-1}(\beta)$

send to B: $h^{k-2}(\alpha)$

...
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Delaying just secrets, not information, is essential in multicast conferencing: the 
receiver will continue viewing the stream of data even though the keys necessary 
to authenticate it are not yet available. When secrets are late, viewing is ahead 
of authentication, and we call this an authentication delay. The delay would be 
small and roughly equivalent to three times the network latency. The reason is 
that a MAC must be sent, then the authenticated acknowledgement is returned, 
and finally the MAC key is sent. Only then can the corresponding block be 
authenticated by the receiver. 

We may now generalize the above construction and obtain the N-party
protocol. During session SN, party $i$ first generates a random secret $\alpha_i$ and
computes $h^k(\alpha_i)$. Party $i$ consists of two processes, a data sender and a key
sender, that run concurrently as described below, where $A_{i,j}$ is the $j$th block
sent by party $i$:

Data sender $i$:

1. multicast $MAC_{h^{k-1}(\alpha_i)}(A_{i,1})$ and $A_{i,1}$;

2. multicast $MAC_{h^{k-2}(\alpha_i)}(A_{i,2})$ and $A_{i,2}$;

...

Key sender $i$:

1. wait for $MAC_{h^{k-1}(\alpha_j)}(A_{j,1})$, for all $j \in [1, N]$;

multicast $h^k(\alpha_i), SN, i, sig_i(h^k(\alpha_i), SN, i)$;

2. wait for $MAC_{h^{k-2}(\alpha_j)}(A_{j,2})$, for all $j \in [1, N]$;

wait for $h^k(\alpha_j), SN, j, sig_j(h^k(\alpha_j), SN, j)$, for all $j \in [1, N]$;

multicast $h^{k-1}(\alpha_i)$;

3. wait for $MAC_{h^{k-3}(\alpha_j)}(A_{j,3})$, for all $j \in [1, N]$;

wait for $h^{k-1}(\alpha_j)$, for all $j \in [1, N]$;

multicast $h^{k-2}(\alpha_i)$;

...

It is important to note that, for every block, the only authentication information 
that is multicast is one MAC (sent by the data sender) and one hash value (sent 
by the key sender). This does not grow with N. 
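The claim that the per-block authentication data does not grow with N can be put into numbers with a short calculation. The 32-byte MAC and hash lengths are assumptions (for example, HMAC-SHA-256 and SHA-256); the text does not fix any sizes, and the function names are hypothetical.

```python
def csa_overhead(mac_len=32, hash_len=32):
    """Per-block authentication bytes in CSA: one MAC plus one hash value."""
    return mac_len + hash_len

def per_receiver_mac_overhead(n_receivers, mac_len=32):
    """Per-block bytes when each receiver needs its own MAC."""
    return n_receivers * mac_len

# (number of receivers, CSA bytes per block, per-receiver-MAC bytes per block)
table = [(n, csa_overhead(), per_receiver_mac_overhead(n)) for n in (2, 10, 100)]
```

With these assumed lengths the CSA overhead stays at 64 bytes per block for any group size, while one MAC per receiver reaches 3200 bytes for 100 receivers.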

Conclusions 

We have defined a Chained Stream Authentication scheme and proved it to be 
secure against non-authentic stream continuations. This property was the basis 
for an interactive stream authentication protocol that was proven to be secure 
against active attacks, even when the attacker may ask for the authentication 
of a number of adaptively chosen streams. The protocol was then optimized for 
the case of a bidirectional flow of data. 

However, the importance of the protocol arises when there are more than 
just two communicating parties. Since there are no shared secrets, the size of 
the authentication data does not grow linearly with the number of parties. In 
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the optimized protocol for N parties, we need just one MAC and one hash value 
for every block of data. By contrast, symmetric individual authentication would 
require N distinguished MACs for every block. Therefore, the CSA protocol 
represents a major improvement. If compared with techniques based on the dig- 
ital signature of each block, CSA is to be preferred because of its much higher 
efficiency. 

With respect to the work in [12], our scheme has the advantage that if a 
packet is lost, using a timeout when waiting for all the MACs and keys, the 
authentication may proceed for the following blocks due to the use of a chain of 
keys, while in [12] a packet may be verified only if all the preceding packets have 
been received. Moreover, the online solution in [12] uses one-time signatures, 
that introduce a communication overhead of the order of 1000 bytes per packet. 
The CSA solution is an order of magnitude more efficient than [12] with respect 
to the authentication information transmitted by the sender. Another difference 
with our work is that [12] allows non-repudiation, while our scheme does not. 



Appendix A (Proof of Lemma 1 and Theorem 1) 

Proof of Lemma 1. Suppose that a forger F exists that can produce a valid
authenticated stream S' with a non-negligible probability $\epsilon$, contradicting the
claim. Then, one of the two following cases must hold, and at least one must
hold with probability at least $\epsilon/2$:

Case 1 $S'_1 = S_1, MAC_1, H, SN, sig(H, SN)$ and there is no $S'^{(i)}$ such that
$S'^{(i)}_1 = S^{(i)}_1, MAC^{(i)}_1, H, SN, sig(H, SN)$. Then, we can use the forger F to
construct algorithm F1 that breaks the asymmetric signature scheme (G,S,V). The
constructed algorithm F1 has access to an oracle for S, and starts by calling F
as a subroutine. When F requires an authenticated stream $S'^{(i)}$, F1 generates a
random secret $\alpha^{(i)}$, computes $h^k(\alpha^{(i)})$, and asks the oracle for $sig(h^k(\alpha^{(i)}), SN)$.
Then, knowing $\alpha^{(i)}$, F1 authenticates the rest of the stream as required by CSA,
and outputs the authenticated stream $S'^{(i)}$ for F. The process continues until,
with probability at least $\epsilon/2$, F outputs the new valid authenticated stream S'.
In this case (Case 1), S' must be such that $S'_1 = S_1, MAC_1, H, SN, sig(H, SN)$
and there is no $S'^{(i)}$ generated by F1 such that
$S'^{(i)}_1 = S^{(i)}_1, MAC^{(i)}_1, H, SN, sig(H, SN)$. This means that a signature $sig(H, SN)$
was never queried by F1 to the oracle. Hence F1 breaks (G,S,V) by
outputting the new valid signature $H, SN, sig(H, SN)$.

Case 2 $S'_1 = S_1, MAC_1, H, SN, sig(H, SN)$ and there is $S'^{(i)}$ such that
$S'^{(i)}_1 = S^{(i)}_1, MAC^{(i)}_1, H, SN, sig(H, SN)$. Then there are three cases, and one
must hold with probability at least $\epsilon/6$:

Case 2.1 $|S'| < |S'^{(i)}|$. This case does not contradict the Lemma.

Case 2.2 $|S'| > |S'^{(i)}|$. In this case we can construct F2 that can invert h, using
the forger F, and hence break g. The inverter F2 is given a value $a$ and must
compute $h^{-1}(a)$ with non-negligible probability. F2 calls GA to generate a key
pair (SK,PK), and then runs F as a subroutine. Let $SN_{max}$ be the maximum
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number of authenticated streams that F will ask for. Clearly SN^ax must be 
polynomially large. F2 will then pick a random number R between 1 and SN^ax- 
When asked to authenticate stream F2 lets = a and then follows 

the CSA authentication algorithm, but choses k = When F stops, it 

will output S' such that |5"| > with probability greater than e/6, and 

i = R with probability 1/S'A^max- Hence, with probability greater than e/(6 * 
SNmax). = (5, A/AC, = 

{S,MAC,h~^{a)). Since h{x) = gx(fi), inverting h means breaking gkey after 
observing gkey(0). 

Case 2.3 $|S'| = |S'^{(i)}| = j$. There are two cases, and one must hold with
probability at least $\epsilon/12$:

Case 2.3.1 $S_j = S^{(i)}_j$. This case does not contradict the Lemma.

Case 2.3.2 $S_j \neq S^{(i)}_j$. Let $S'_j = S_j, MAC_{key}(S_j), h^{k-j+1}(\alpha)$. There are two
cases, and at least one must occur with probability at least $\epsilon/24$:

Case 2.3.2.1 MAC_key(S_j), generated by F, and MAC(S^(i)_j), given in stream
S'^(i), are valid under the same key. In this case we can use F to break g_uk, for
some unknown key uk, in a polynomial algorithm F3, that has access to an oracle
for g_uk. Define again SN_max as the maximum number of authenticated streams
that F will ask for. F3 will then pick a random number R between 1 and SN_max.
F3 will run F as a subroutine, will use GA to generate a key pair (SK,PK), and
will authenticate all streams requested by F normally using CSA, except for
stream For this stream, define I = and let = uk, and k = 1. 

Then, h(uk). F3 then computes (uk), ..., h(uk), after querying
the oracle for h(uk) = g_uk(0), and uses these values, in this order, to compute
the MACs for S^(R)_1, ..., S^(R)_l, as required in CSA. As the last authenticated
block, F3 outputs S'^(R)_l = (S^(R)_l, MAC_uk(S^(R)_l), h(uk)), where MAC_uk(S^(R)_l) =
g_uk(1, S^(R)_l) is queried to the oracle. With probability 1/SN_max, the authenticated
stream S' output by F is associated to stream S^(R), i.e., i = R. In this case,
MAC_uk(S^(i)_j) and MAC_key(S_j) are valid under the same key, i.e., key = uk. F3
then outputs MAC_uk(S_j) = g_uk(1, S_j), and since S_j ≠ S^(i)_j, this is a new forged
MAC, and a correct value of g_uk for the new input (1, S_j), that is generated with
non-negligible probability greater than e/(24 * SN_max).

Case 2.3.2.2 MAC_key(S_j), generated by F, and MAC(S^(i)_j), given in stream
S'^(i), are not valid under the same key. We show that, in this case, there is no
polynomial time algorithm F’ that can generate with non-negligible probability
a valid authenticated continuation T' ⊃ S', given S'^(1), ..., S'^(SN), possibly other
adaptively chosen authenticated streams and any valid continuation
of S'^(i). Suppose such a forger F’ exists, and does the above with non-negligible
probability e'. Let T'_(j+1) = (T_(j+1), MAC, key). Since T' is valid and is a
continuation of S', we also know that T'_j = S'_j = (S_j, MAC_key(S_j),

We construct F4 that can generate collisions for h with non-negligible probabil- 
ity, using F and F’. F4 starts by generating a key pair {SK, PK). Then, it runs 
F as a subroutine and authenticates the requested streams normally using CSA.
F will then output S' satisfying the conditions of this case. We also know that a 
continuation forger F’ exists and therefore S' has a possible valid continuation. 
Consequently MACkey(Sj) is a valid MAC for some value of key, and for the 
conditions in this case, key yf {a). This all happens with probability greater 
than e/24. In order to obtain key, and thus obtain a collision for h, F4 must 
then run F’ as a subroutine. F4 will authenticate additional streams asked by F’ 
normally, using CSA, and will then produce a continuation of , as requested 
by F’, also using CSA. When F’ outputs a valid continuation T' of S' , this must 
include the value of key, and F4 outputs {key, h^~^{a)) as a collision for h. This 
must occur with non-negligible probability (greater than e * e'/24). QED 
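
The case analysis above relies on both h and the MAC being derived from a single keyed function g, with h(x) = g_x(0) and MAC_key(S) = g_key(1, S). The following Python sketch illustrates that construction only. HMAC-SHA256 is used as a stand-in for g, and the names g, h, mac and hash_chain are illustrative assumptions rather than part of the scheme as published.

```python
import hashlib
import hmac
from typing import List


def g(key: bytes, data: bytes) -> bytes:
    """Keyed function g; HMAC-SHA256 is only a stand-in (assumption)."""
    return hmac.new(key, data, hashlib.sha256).digest()


def h(x: bytes) -> bytes:
    """h(x) = g_x(0): evaluate g under key x on the fixed input 0."""
    return g(x, b"\x00")


def mac(key: bytes, block: bytes) -> bytes:
    """MAC_key(S) = g_key(1, S): the input is prefixed with 1, never with 0."""
    return g(key, b"\x01" + block)


def hash_chain(seed: bytes, length: int) -> List[bytes]:
    """Return [seed, h(seed), h(h(seed)), ...] with length + 1 entries."""
    chain = [seed]
    for _ in range(length):
        chain.append(h(chain[-1]))
    return chain
```

Because the MAC input always starts with the byte 1 and h always uses the single byte 0, a MAC query to g can never coincide with the query that defines h under the same key.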

Proof of Observation 1: We construct an algorithm F that simulates A and 
B over an insecure network, where E can perform active attacks. F controls 
the simulations of A and B out of band, i.e. over a secure, separate channel. 
Suppose that, with non-negligible probability, E has taken action so that B has 
not yet sent the first message of session SN, when A has already sent its first 
message of session SN. Then we construct F so that it can forge signatures 
under the asymmetric scheme (G,S,V). F simulates A normally, by first calling
GA to generate a key pair (SK_A, PK_A). For simulating B, F does not generate
a key pair, but relies on the oracle for the signature required in the first message
of every session. At some point, F’s simulation of A must have computed the
first message of session SN, and it must therefore have received sig_B(H, SN).
However, we have supposed F’s simulation of B has not yet computed the first
message of session SN. As a consequence, sig_B(H, SN) was never queried to the
oracle, and can be output as a forged signature. QED.

Proof of Observation 2: Under the same setting of Observation 1, F simulates
B normally, by first calling GA to generate a key pair (SK_B, PK_B). For
simulating A, F does not generate a key pair, but relies on the oracle for the
signature required in the first message of every session. At some point F’s
simulation of B must have sent the second message of session SN, and therefore
it must have received sig_A(H, SN). However, since F’s simulation of A has not
yet begun running session SN, it has not yet computed the first authenticated
block. As a consequence, sig_A(H, SN) was never queried to the oracle, and can
be output as a forged signature. QED.

We shall now prove Theorem 1 by induction on |S'|, based on Lemma 1.
However, we first need the following, that characterizes the information available 
to an active attacker at any given moment: 

Lemma 2. Suppose A and B run session SN, where: 

(1) (G,S,V) is a secure signature scheme, and 

(2) h is a one-way hash function 

Suppose also that B has received, during session SN, the valid authenticated
stream S' = S'_1, ..., S'_(j-1), and no more. Then, there is no active attacker E that
can, with non-negligible probability, cause A to release more than j authenticated
blocks during session SN.

Proof. Let parties A and B run session SN of the CSA protocol with one sender
and one receiver. Suppose the Lemma is false. Then there is an active attacker
E that can, with non-negligible probability, cause a situation where A has
released the stream A' = ..., while B has only received j - 1 blocks
S'_1, ..., S'_(j-1). Since A has released j + 1 blocks, it must have received
authenticated acknowledgements {(3). There are two cases, and one must
hold with probability at least e/2:

Case 1: β ≠ β^(SN), the secret value chosen by B for session SN. Then, we show
that we can construct an algorithm F1 that can simulate A and B, and use a
signature oracle to forge signatures under (G,S,V). F1 simulates A by running
the sender side of the CSA protocol. F1 simulates B by running the receiver side
of the CSA, but uses the signature oracle to produce the signatures needed in
the first authenticated block of each session. When E has caused the situation
covered by this case, F1’s simulation of A must have received sig(β, SN), and
since β ≠ β^(SN), this can be output as a new forged signature.

Case 2: β = β^(SN). Then, we can construct F2 that can compute h^-1(x), for
any x, with non-negligible probability. F2 will simulate A and B, running the
CSA protocol over a network where E can perform active attacks. F2 simulates
A by running the sender side of the CSA protocol. F2 simulates B by running
the receiver side of the CSA, but first sets the following values:

— pick SN at random between 1 and SN_max, where SN_max is the maximum
number of sessions that the attacker is able to cover;

— pick j at random between 1 and j_max, where j_max is the maximum number
of blocks per session in E’s active attacks;

— let fe = j — 1 and = N{x); 

B is then able to send to A all acknowledgments required for receiving the 
first j — 1 blocks, i.e., (^p(SN)'j _ ^hen E, 

has caused the situation covered by this case, F2’s simulation of A must have 
received x). This happens with non-negligible probability 

e/(2 * j_max * SN_max). QED.

Proof of Theorem 1: We prove a stronger claim by induction on |S'| = j,
namely: under the conditions of the Theorem, the thesis holds and, if S_j is
non-authentic, then E can cause B to receive a valid continuation of S' only with
negligible probability.

Base: We show that the claim is true for j = 1. Before B has received from A 
the first message of session SN, by Lemma 2, A has released no more than the 
first block of session SN. Therefore, the information available to E consists of: 

— the previous authenticated streams ..., sent by A, and 

— the first block of session SN. 

Let S'_1 = (S_1, MAC_1, H, SN, sig(H, SN)). Since S' is valid, by Lemma 1, there is
q ∈ [1, SN] such that S'^(q)_1 = (S^(q)_1, MAC^(q)_1, H, SN, sig(H, SN)). Since sequence
number SN is only used in clearly the only value for q is SN. Also by
Lemma 1, if S_1 ≠ S^(SN)_1, then there is no polynomial algorithm that can
generate a valid authenticated continuation of S', given adaptively chosen streams,
and any valid continuation of A'. So neither can E, even after obtaining

Inductive step: Suppose that the inductive claim is true for j - 1. We prove
that it is also true for j. In order to do so, suppose B has received the valid stream
S' = (S'_1, ..., S'_j). This means that it was possible to produce a valid continuation
of (S'_1, ..., S'_(j-1)), and, by the inductive hypothesis, S_1, ..., S_(j-1) must be authentic
with probability 1 - η, where η is negligible. We now have to prove that, if S_j
is non-authentic, then E can cause B to receive a valid continuation of S' only
with negligible probability. By Lemma 2, before B’s receipt of S', A has released
at most j blocks during session SN. Therefore, the information available to E
before B’s receipt of S' consists of:

— the previous authenticated streams A'^^\ sent by A, and 

— the stream ..., sent by A during session SN. 

Let S'_1 = (S_1, MAC_1, H, SN, sig(H, SN)). Since S' is valid, by Lemma 1, there is
q ∈ [1, SN] such that S'^(q)_1 = (S^(q)_1, MAC^(q)_1, H, SN, sig(H, SN)). Since sequence
number SN is only used in clearly the only value for q is SN. Also by
Lemma 1, if S_j ≠ S^(SN)_j, then there is no polynomial algorithm that can
generate a valid authenticated continuation of S', given adaptively chosen streams,
and any valid continuation of A'. So neither can E, even after obtaining

QED. 

Abstract. The paper describes a novel application of a Privilege Management
Infrastructure (PMI) to enforce copyright protection in electronic content
distribution. The PMI is “global” in nature and thus permits customers to
gain access to content on any appropriate device. The
use of a PMI also allows delegation of access to content. A unique key 
encrypting scheme provides increased security over other methods of pro- 
tecting electronic content. 



1 Introduction 

The distribution of electronic content via the Internet is becoming more and more 
common. Electronic content includes such objects as electronic documents (e.g., 
PDF, Word documents), music (e.g., MP3), video (e.g., MPEG), and games. 
There are presently problems with this method of distribution. The most signifi- 
cant of these is that once the content has been downloaded it can easily be copied 
and re-distributed. Thus, enforcing copyright protection is difficult, particularly 
if the customer wishes to download the content for use when off-line or for use 
on multiple machines/devices. In addition, it is difficult to provide customers 
fine-grained access to content (e.g., it is difficult to allow individual customers 
to buy one article from a magazine), to reliably identify customers, and to allow 
customers to further delegate access to content in a controlled manner, when 
required. This proposal attempts to address these problems by taking advantage 
of some current solutions for authentication using a Public Key Infrastructure 
(PKI) (see [3] for an overview of PKI) as well as some new ideas using attribute 
certificates [9]. 

In order to provide a concrete example, this paper will describe a sample 
application that distributes PDF versions of magazine articles using a PDF 
viewer. Generalizations to other forms of electronic content should be relatively
straightforward.

While this solution, like most solutions for content distribution, does not 
prevent unauthorized distribution by determined and malicious legitimate cus- 
tomers (especially in a software environment), it does prevent the typical user 
from doing so, while still allowing online or off-line use, access from different 
devices, further delegation and fine-grained access control. 

2 What Is a PMI?

Within a Public Key Infrastructure a public key is bound to a user’s identity 
through the use of a certificate. For example, in an X.509 [6,7] based PKI, a 
Certification Authority (CA) will verify the identity of the user and that he/she 
actually has the private key associated with the claimed public key. If the test of 
identity and Proof-of-Possession pass, then the CA will place the public key along 
with the user’s identity in an X.509 public key certificate and sign this certificate 
using its private key. When any other entity wishes to verify a signature of or
encrypt data for the user, he/she first verifies the signature on the certificate 
using the CA’s public key. If the signature verifies then the public key contained 
in the certificate can be used for the desired purpose. Thus, end entities need 
only trust the CA’s public key, typically achieved through some out-of-band 
method [2], in order to validate the certificates of other entities in the PKI. If 
an end entity trusts a CA and has obtained the CA’s public key in a way that 
guarantees its authenticity, the CA’s public key is known as the CA’s root key. 

A Privilege Management Infrastructure (PMI) [7] is similar to a PKI, ex- 
cept that instead of using public-key certificates to bind a user’s identity to a 
public key, an attribute certificate is used to bind an identity to certain rights 
or privileges. An Attribute Authority (AA) that wishes to grant a user certain 
privileges will codify the privileges (usually represented by an attribute-value 
pair) and place them in an attribute certificate with the user’s identity. The AA 
then signs the attribute certificate using its private key. When the user wishes 
to use those privileges to gain access to a protected resource he/she presents the 
attribute certificate to the entity controlling access (the “gatekeeper”). The gate- 
keeper will then authenticate the user and verify the signature on the attribute 
certificate using the AA’s root public key. The gatekeeper must have already 
established trust in the AA’s public key (again, typically achieved through some 
out-of-band mechanism). If the signature verifies and the attribute certificate 
contains the required attribute, the user is allowed access to the protected re- 
source. In our example, the PDF viewer will act as the gatekeeper. 
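
The gatekeeper check described above can be sketched in Python as follows. This is an illustration under stated assumptions, not an X.509 implementation: the signature is textbook RSA over a toy modulus built from two Mersenne primes (far too small for real use, chosen only to avoid third-party dependencies), the certificate is a plain dictionary, and the names issue and gatekeeper_allows are hypothetical. Authentication of the user is assumed to have happened before the call.

```python
import hashlib
import json

# Toy RSA key for the Attribute Authority (assumption, illustration only).
P, Q = 2**61 - 1, 2**89 - 1          # both are Mersenne primes
N = P * Q
E = 65537                            # public exponent, known to the gatekeeper
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def digest(cert_body: dict) -> int:
    """Hash the canonical JSON encoding of the certificate body into Z_N."""
    encoded = json.dumps(cert_body, sort_keys=True).encode()
    return int.from_bytes(hashlib.sha256(encoded).digest(), "big") % N


def issue(holder: str, attributes: dict) -> dict:
    """AA side: bind attribute-value pairs to an identity and sign the result."""
    body = {"holder": holder, "attributes": attributes}
    return {"body": body, "signature": pow(digest(body), D, N)}


def gatekeeper_allows(cert: dict, authenticated_user: str, required: tuple) -> bool:
    """Gatekeeper side: check the holder, the AA signature, and the attribute."""
    body = cert["body"]
    if body["holder"] != authenticated_user:
        return False
    if pow(cert["signature"], E, N) != digest(body):
        return False
    name, value = required
    return body["attributes"].get(name) == value
```

For example, `gatekeeper_allows(issue("alice", {"view": "article-42"}), "alice", ("view", "article-42"))` grants access, while changing the holder or the attribute after issuance invalidates the signature.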

In a PKI, a CA can certify the public key of another subordinate CA, thus 
allowing end-entities that trust one CA to validate certificates of the subscribers 
in another CA domain. Similarly, an AA can grant privileges to another AA, thus
allowing gatekeepers who trust one AA to accept attribute certificates issued by 
another AA. The gatekeeper must now verify that the intermediate AA has, in 
fact, been delegated authority to grant this privilege by the trusted AA. This 
process is referred to as delegation in [9]. We will also adopt that terminology.
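
The delegation check can be illustrated with a short Python sketch. It models only the delegation logic: signature verification on each certificate is assumed to have succeeded already, and the AttributeCert fields and the function name chain_is_valid are hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AttributeCert:
    issuer: str         # AA that signed the certificate
    holder: str         # entity receiving the privilege
    privilege: str      # label for the privilege being granted
    may_delegate: bool  # whether the holder may pass the privilege on


def chain_is_valid(chain: List[AttributeCert], trusted_root: str, privilege: str) -> bool:
    """Walk the chain from the trusted AA down to the end user."""
    if not chain or chain[0].issuer != trusted_root:
        return False
    for index, cert in enumerate(chain):
        if cert.privilege != privilege:
            return False
        if index + 1 < len(chain):
            following = chain[index + 1]
            # The next certificate must be issued by this holder, and this
            # holder must have been allowed to delegate.
            if not cert.may_delegate or following.issuer != cert.holder:
                return False
    return True
```

A chain whose intermediate AA lacks the delegation flag is rejected even if every signature on it is valid.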

3 The Idea 

The idea is that a root Attribute Authority for a large Privilege Management 
Infrastructure (PMI) would control access to individual pieces of electronic con- 
tent. Each PDF viewer, for example, would have the root key for this PMI 
embedded within it and access to the document would not be granted unless a 
valid attribute certificate for the customer within that PMI existed. In addition,
each viewer would require an embedded CA root key for a PKI to authenticate 
users and also a master symmetric key for obtaining access to content. 

Thus, the content creator would encrypt each piece of electronic content. For 
example, the magazine could make articles from each issue available for purchase. 
Customers would authenticate themselves to the magazine website and pay for 
and download the articles desired. When the user wished to read the downloaded 
articles, the viewer would first require authentication of the customer and, if a 
valid attribute certificate existed, decrypt and display the article. 

The advantage of using this method of distributing electronic content instead 
of just encrypting the content for each customer using, for example, CMS [5] or 
PKCS #7 [10] is that if the content is encrypted directly for the customer, he/she 
can simply decrypt and distribute the pirated content. If the proposed method 
were used however, the PDF viewer, for example, would only decrypt the content 
upon authentication of the customer and could make the plaintext difficult to 
obtain (e.g. would not write the content to disk). 

There is, however, at least one potential problem with this scheme (see Sec- 
tion 5.1 below). This proposal will only make it more difficult for most legitimate 
customers to illegally gain access to or copy and distribute electronic content. 
Determined individuals (i.e., those with the ability to analyze executable code 
or those with access to the internal workings and components of their computer, 
device, etc.) will still be able to do bad things. Unfortunately, it appears that 
this will always be a property of e-content distribution schemes since at some 
point the plaintext content must appear somewhere on the customer’s machine. 
If someone has the ability to analyze how the plaintext was obtained or to gain 
direct access to the plaintext as it is being displayed, they will always be able to
compromise the system. 



4 The Architecture 

This section describes the proposed Privilege Management Infrastructure as well 
as the accompanying Public Key Infrastructure for authentication. 

4.1 The PMI 

In this architecture, the root key of the PMI is embedded in the PDF viewer (or 
the appropriate viewer for the type of content). It is envisioned that this root 
could be the root of a global PMI similar to the PKI roots that exist in web 
browsers. The root Attribute Authority would then issue an attribute certificate 
to the PDF viewer manufacturer indicating that it was authorized to produce 
PDF documents to be displayed by the viewer and that this privilege could be 
delegated. In a similar way, the root Attribute Authority could issue attribute 
certificates to any manufacturer of electronic content viewers. The PDF viewer 
manufacturer would then issue an attribute certificate to the magazine publisher 
indicating that it was authorized to produce PDF documents (i.e., that it could 
authorize customers to have access to the encrypted content). The utility of
having these layers of Attribute Authorities will be discussed further in Section
6.3. The need for a PMI, and in particular attribute certificates, will be described
in Section 5.4.

Fig. 1. The PMI architecture

Upon the purchase of an article, the customer would authenticate to the 
magazine publisher and provide it with a customer symmetric key. The maga- 
zine publisher would then issue an attribute certificate to the customer indicating 
that the customer was authorized to view the particular article and also including 
the content symmetric key used to encrypt the article, encrypted with the master 
symmetric key and then encrypted with the customer symmetric key. (The pur- 
pose of doubly encrypting the content symmetric key will be discussed further 
in Section 5.) The customer’s attribute certificate could also place restrictions 
on when or how the content is to be viewed and may or may not allow further 
delegation. For example, a university library may subscribe to the magazine and 
then provide access to all of its students. The encrypted article including the cus- 
tomer’s, the magazine publisher’s and the PDF viewer manufacturer’s attribute 
certificates would be delivered to the customer. 

When the customer wishes to read the article, the viewer would authenticate 
and receive the customer symmetric key from the customer, and also verify 
the validity of the customer’s, the magazine publisher’s and the PDF viewer 
manufacturer’s attribute certificates using its embedded PMI root key. If the 
complete attribute certificate chain validated, the customer symmetric key and
the master symmetric key would be used to decrypt the content symmetric key
that would be used to decrypt the article, which would be displayed to the user.

Fig. 2. The PKI architecture

In this way, only legitimate customers that had paid for the article could 
view it. 



4.2 The PKI 

There are a number of possible PKI architectures that are compatible with the 
proposed PMI architecture. In fact any method of authenticating the customer 
could be used instead of a PKI. Here we describe one possible architecture. 

Since the viewer must be able to authenticate legitimate customers, it must 
have the root of a PKI embedded within it. Then the root CA could certify the 
PDF viewer manufacturer’s CA, which would in turn certify the customer (as 
well as, in this example, the magazine publisher). 



4.3 How It Could Work 

For example, when the customer downloads the PMI-enabled PDF viewer, he/ 
she would also be enrolled in the PDF viewer manufacturer’s PKI domain. (Al- 
ternatively, the customer could already belong to a PKI that could be chained to 
the PDF viewer manufacturer’s PKI domain.) Then, when the customer wishes 
to purchase a magazine article he/she first authenticates him/herself to the mag- 
azine publisher using standard Internet authentication techniques (SSL/TLS [4] 
or SPKM [1], for example), provides it with the customer symmetric key over 
the established session, pays for the content and obtains the content and
appropriate attribute certificates. When he/she wishes to read the article, the PDF
viewer would log the customer into their PKI identity (if he/she was not already 
logged in), authenticate the customer’s identity (using, for example, techniques 
in ISO/IEC 9798-3 [8]) and obtain the customer symmetric key from the cus- 
tomer in order to determine whether or not to allow access to the content. 
Thus, the sequence of events becomes: 

1. Customer downloads the PDF viewer. 

2. Customer enrolls in PDF viewer manufacturer’s PKI (if not already enrolled
in a PKI).

3. Customer authenticates to the magazine publisher. 

4. Customer pays for the article and provides the magazine publisher with the 
customer symmetric key. 

5. Magazine publisher encrypts the article with the content symmetric key. 
(Could be done in advance.) 

6. Magazine publisher encrypts the content symmetric key with the master 
symmetric key to produce an encrypted symmetric key. (Could be done in 
advance.) 

7. Magazine publisher encrypts the encrypted symmetric key with the customer 
symmetric key to produce a doubly encrypted symmetric key. 

8. Magazine publisher Attribute Authority creates an attribute certificate for 
the customer containing the doubly encrypted symmetric key. 

9. Magazine publisher sends the encrypted article and the attribute certificates 
to the customer. 

10. Customer authenticates to the PDF viewer and provides the customer sym- 
metric key. 

11. The PDF viewer validates the attribute certificate. 

12. The PDF viewer decrypts the content symmetric key using the customer 
symmetric key and the master symmetric key. 

13. The PDF viewer decrypts the content using the content symmetric key. 

14. The PDF viewer displays the content. 
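
The key handling in steps 5 to 7 and 12 to 13 can be sketched in Python as follows. This is an illustration under stated assumptions: toy_encrypt is a throwaway XOR keystream cipher standing in for a real symmetric cipher, the function names are hypothetical, and the authentication and attribute certificate steps are omitted.

```python
import hashlib


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream. Illustration only, not secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


toy_decrypt = toy_encrypt  # XOR with the same keystream undoes itself


def publisher_prepare(article: bytes, content_key: bytes, master_key: bytes):
    """Steps 5 and 6: can be done once, in advance, for each article."""
    encrypted_article = toy_encrypt(content_key, article)
    encrypted_key = toy_encrypt(master_key, content_key)
    return encrypted_article, encrypted_key


def publisher_sell(encrypted_key: bytes, customer_key: bytes) -> bytes:
    """Step 7: per-customer outer layer, placed in the attribute certificate."""
    return toy_encrypt(customer_key, encrypted_key)


def viewer_open(encrypted_article: bytes, doubly_encrypted_key: bytes,
                customer_key: bytes, master_key: bytes) -> bytes:
    """Steps 12 and 13: peel the customer layer, then the master layer, then decrypt."""
    encrypted_key = toy_decrypt(customer_key, doubly_encrypted_key)
    content_key = toy_decrypt(master_key, encrypted_key)
    return toy_decrypt(content_key, encrypted_article)
```

Only publisher_sell runs per purchase. Note that the XOR stand-in makes the two layers commute, which a real block cipher would not, so the sketch shows the data flow rather than the security properties.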

5 Security Issues 

5.1 Encrypting the Content 

Each piece of content would be encrypted with a unique content symmetric key. 
This encryption would only have to be performed once for each piece of content. 
The content symmetric key would then be encrypted with a master symmetric 
key and a customer symmetric key and placed within each customer’s attribute 
certificate. The master symmetric key would be embedded in each viewer to 
allow decryption of the content symmetric key and thus, the content. This key 
should also be different for each type of viewer (i.e. for each type of e-content). 
The customer symmetric key, which would also be required to obtain access to 
the content, should be stored securely for the legitimate customer in such a way 
that it is portable to different machines/devices. A simple solution is to store
it within the customer’s Personal Security Environment (PSE) [2]. A PSE is a 
software or token based secure storage for the customer’s private keys and other 
sensitive information. Most PSEs can be moved from one device to another. 

In this solution knowledge of both the master symmetric key and the cus- 
tomer symmetric key is required to gain access to the content. Thus, both must 
be securely protected. If the customer symmetric key is stored in the customer’s 
PSE, it should only be available to the legitimate customer. The master symmet- 
ric key must be embedded in each viewer, however, and thus must not be readily 
available by analysis of the viewer executable. This could be accomplished by 
implementing a function whose sole purpose is to decrypt keys encrypted with 
this master symmetric key value. In other words, the plaintext key need not 
appear in memory and need not be passed into a general purpose decryption 
algorithm. A function could be used that would only perform decryption with 
the given key. The key then wouldn’t need to actually appear in memory since 
bit operations could be used to optimize and obfuscate decryption with this 
code. Even so, the master symmetric key could become available to determined 
adversaries. However, unless they have the cooperation of a legitimate customer 
in order to acquire a customer symmetric key, they are no further ahead. Thus, 
determined legitimate customers may be able to get access to the plaintext, but 
this may not be preventable (see Section 5.2). 

Note that if the viewer is implemented in secure hardware, then it is highly 
unlikely that the master symmetric key could be obtained and thus the solution 
described in this paper is very secure. For this reason this solution is more 
applicable to hardware implementations. 

In addition, to support encryption the master symmetric key would have 
to be kept in a secure location so as not to be compromised. A Trusted Third 
Party (e.g. the root CA or root AA) could keep this key in secure hardware and 
encrypt content symmetric keys for content creators. The content creators (e.g. 
the magazine publisher) would authenticate themselves to the Trusted Third 
Party and present their attribute certificate indicating that they are legitimate 
creators. They could then provide the content symmetric key to the Trusted 
Third Party and it would be encrypted using the master symmetric key. Again, 
this operation would only need to be performed once for each piece of content. 
This encrypted key would then be encrypted again for each customer using the 
customer symmetric key. 

The content, master and customer symmetric keys could, in fact, all be asym- 
metric keys. However, it is recommended that symmetric key cryptography be 
used for these keys, to allow for more efficient operations at the server. 

5.2 Making Plaintext Unavailable 

Using the other methods described in this document to restrict access to elec- 
tronic content will not be successful if the decrypted content is somehow made 
available or stored on the user’s disk, allowing copying of the content and
unauthorized distribution. Thus, viewers should keep the plaintext only in memory.
However, even then, determined individuals could certainly read the content by
scanning memory. Therefore, again, determined legitimate customers may be able to
gain access to the plaintext content. 

In some applications the disclosing of plaintext content may not be unde- 
sirable for certain customers. In these cases, the customer’s attribute certificate 
could indicate whether or not the decrypted content should be made easily avail- 
able to them. 

5.3 Further Delegation 

One of the advantages of this scheme is that it allows further delegation of
privilege to access the electronic content. Let us consider the example of a university
library that wishes to grant access to magazine articles to its students. The li- 
brary has an attribute certificate containing its unique identifier, an indication 
of the privilege to view the content, and the content symmetric key encrypted 
with the master symmetric key and also the library’s customer symmetric key. In 
order to delegate access, the attribute certificate must also contain an indication 
that the library is in fact allowed to delegate access. 

When it does wish to delegate access to the magazine articles, the library will 
create an attribute certificate for each student to which access will be granted. 
The new attribute certificates will contain an identifier for the student to which 
access is granted, an indication of the privilege to view the content and the 
content symmetric key. The content symmetric key will be encrypted with the 
master symmetric key and the student’s customer symmetric key. The library 
can produce this encrypted key by taking the doubly encrypted key out of its 
attribute certificate, decrypting it using its own customer symmetric key (leaving 
the singly encrypted key) and then encrypting it with the student’s customer 
symmetric key. 
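
The re-encryption performed by the library can be sketched in Python as follows. The cipher is again a toy XOR keystream standing in for a real symmetric cipher, and the names toy_xor and rewrap_for_student are hypothetical. The sketch shows that the library handles only the singly encrypted key and never the content symmetric key itself.

```python
import hashlib


def toy_xor(key: bytes, data: bytes) -> bytes:
    """Toy stand-in cipher: XOR with a SHA-256 counter keystream (not secure)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def rewrap_for_student(doubly_encrypted_key: bytes,
                       library_key: bytes,
                       student_key: bytes) -> bytes:
    """Swap the library's outer layer for the student's outer layer."""
    # Removing the library layer leaves the key still encrypted under the master key.
    singly_encrypted_key = toy_xor(library_key, doubly_encrypted_key)
    return toy_xor(student_key, singly_encrypted_key)
```

The value returned by rewrap_for_student is what the library would place in the student's attribute certificate.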

The library must obtain the student’s customer symmetric key in order to 
place the properly encrypted content symmetric key in the attribute certificate. 
Thus, it may make sense in these circumstances (and, in fact, any situation where 
the AA cannot be trusted with the customer symmetric key) for the customer 
to produce different symmetric keys for each application. 

5.4 Why Attribute Certificates? 

One may be tempted to not use attribute certificates at all in this type of scheme. 
Shouldn’t the presence of the content symmetric key encrypted with both the 
master and customer symmetric keys be enough evidence that the customer had 
been granted access to the e-content? 

Unfortunately, this is not the case. Since the outer encrypting of the con- 
tent symmetric key is performed using the customer symmetric key, a mali- 
cious customer could very easily remove this encryption and encrypt it with any 
other symmetric key, thus easily delegating access. This would be undesirable. 
Attribute certificates eliminate this security weakness by placing this doubly 
encrypted key inside a signed object that cannot be created by the customer. 

Note that it is not feasible in a large scale environment to reverse the order
of encrypting so that the outer encrypting is performed by the master sym- 
metric key. This change would require that the content symmetric key must be 
encrypted with both the customer and master symmetric key each time a cus- 
tomer was granted access. This would mean that the master symmetric key must 
be kept on-line which will decrease efficiency and could make the key vulnerable 
to attackers who attempt to break into the server in which it resides. With
the present scheme, the content symmetric key need only be encrypted with the
master symmetric key once, and then encrypted with a customer symmetric key
each time a customer is granted access. 

6 Other Issues 

6.1 Anonymous DNs 

A PKI could be used for authentication of customers. However, it is possible that 
some customers would not want their name or other vital information to appear 
in a widely available certificate. For such environments, it is recommended that 
anonymous DNs be used. In this example, the PDF viewer manufacturer’s CA 
may be required to keep a database linking the anonymous DNs with actual 
identity information. Also, naming rules must be enforced so that each customer 
receives a unique DN within this PKI. It may also be desirable for a customer to
have a different certificate (and DN) for each viewer for which he/she is
registered.

6.2 Certificate Rollover 

In many cases it would be undesirable if a customer bought a song and after 6 
months he/she couldn’t use it because his/her public key certificate had expired. 
There are two possible solutions to this problem. One solution is to make cus- 
tomer certificates very long-lived (e.g. 10, 20 years). However, issuing long-lived 
public key certificates to end entities is discouraged in most environments for 
security reasons. 

A second solution is to make the key short-lived (e.g. 6 months or a year) and 
require that every few months users must re-connect to the Internet to contact 
the PDF viewer manufacturer’s CA and obtain a new public key certificate. A 
warning would have to be displayed when certificate expiry is approaching which 
advises customers of this requirement. This has the disadvantage that people who 
remain off-line for extended periods of time lose access to all of their electronic 
content. In order to link the attribute certificate issued to the customer with any 
public key certificate issued to that customer by the PDF viewer manufacturer’s 
CA, the customer should be identified by their DN in the attribute certificate. 

When the public key certificate of an attribute authority expires, however,
all attributes issued by that authority can no longer be verified. Thus, attribute
authorities must have keys that are very long lived (e.g. 20 years).

6.3 Are Global Root Keys Required?

The description in this paper assumed for simplicity that there would be one 
global root key for the PMI that would be used for all types of electronic con- 
tent, one global root key for the PKI that would identify each customer of elec- 
tronic content, and one master symmetric key for decrypting content. This is not 
strictly necessary. The PDF viewer manufacturer, for example, could establish 
its own roots for the PKI and PMI and master symmetric key that would be 
embedded within each PDF viewer. In some situations, this configuration may 
be more desirable. 

7 A Comparison with Other Schemes 

This section will describe other possible solutions for distributing electronic con- 
tent and compare them with the solution proposed in this paper. 

7.1 Encrypting the Content Just for the Customer 

Another method of allowing the secure downloading of electronic content so that 
it is only accessible by the legitimate customer is to simply encrypt the content 
for the customer. The customer simply generates a (symmetric or asymmetric) 
key and sends it to the e-content distributor who encrypts the content for the 
user. This solution is conceptually simple and also allows the user to gain access 
to the content on different devices. However, it is now very easy for malicious 
customers to decrypt the content and distribute the plaintext. While it is also 
possible with the scheme described in this paper for malicious customers to gain 
access to plaintext by gaining access to the master symmetric key, it is much 
more difficult than simply performing a decryption using a key known to the 
customer. 

Similarly, it is possible for a malicious customer to sell his/her PSE and
password, thus allowing others to obtain access to all content he/she has
purchased. Customers will be deterred from doing this for two reasons. First,
anyone with access to the customer's PSE would also be able to impersonate the
customer, thus potentially incurring large costs for the customer.
Secondly, if unauthorized redistribution of electronic content occurs on a large 
scale, the presence of the customer’s PSE among a large number of people allows 
authorities to trace the source back to the malicious customer. 

Instead of encrypting the content directly for the customer, an alternative 
solution is to encrypt it for the customer’s computer. A (symmetric or asym- 
metric) key could be generated on the customer’s computer and stored in such 
a way that it is only accessible on that computer. For example, it could be en- 
crypted by a key generated from unique data on the host computer. This makes 
it difficult for malicious customers to gain access to plaintext, but does not allow 
customers to view/play the content on different computers or devices. 

In addition, neither of these solutions allows secure delegation of access.

7.2 Encrypting the Content Just for the Viewer

It may also be tempting to encrypt the content using just a key that is embedded 
in the viewer. This solution allows anyone with a copy of the viewer to have access 
to the content, however. This may make sense in situations where sales of the 
viewer are projected to be more important than sales of the content, but that 
business model is seldom the one envisioned in current and projected e-content 
distribution ventures. 

This solution also suffers from the problem that if someone is able to find the 
decryption key in the viewer and distribute it, unlimited access to all content 
for everyone may be available. 

While the solution described in this paper also relies upon a key embedded 
in the viewer, loss of this key does not immediately provide unlimited access to 
all content. Only a legitimate customer can obtain access. Thus, this solution 
provides additional security over simply encrypting content for the viewer, and 
also allows a more realistic business model. 

8 Conclusion 

This paper described a method for enforcing copyright protection on a per- 
customer basis. The described solution allows both online and off-line use, pro- 
vides customers access to content on any device that has the appropriate viewer, 
allows further delegation of access, and is secure except against very determined 
malicious legitimate customers.

Abstract. We demonstrate the existence of an efficient block cipher
with the property that whenever it is composed with any non-perfect 
cipher, the resulting product is strictly more secure, against an ideal 
adversary, than the original cipher. We call this property universal se- 
curity amplification, and note that it holds trivially for a one-time pad 
(a stream cipher). However, as far as we are aware, this is the first ef- 
ficient block cipher with this property. Several practical implications of 
this result are considered. 



1 Introduction 

It is often asked in cryptography whether the product of two ciphers might 
be more or less secure than one of the ciphers by itself. An amplification of 
security doesn’t happen in general and important counterexamples have been 
identified. For example, if the permutations of a block cipher form a group (or 
more precisely, are uniformly distributed on a subgroup of the symmetric group 
on the set of message blocks), then two-key double encryption is no better than
single encryption. Thus, it has been seen as important to rule out this pathology
in the case of DES [4]. Furthermore, the security of a product can actually be
less than that of the second cipher when the plaintext statistics are ill-behaved 
with respect to the permutations of the first cipher [14]. Nevertheless, depending 
on how security is measured and how the ciphers are modeled, other affirmative 
results have been advanced [20,8,1]. 

In this paper, we take a novel approach to this problem, raising a strong 
existence question about the security of product ciphers. Specifically, we ask: Is 
there an efficient block cipher which amplifies the security, against an ideal ad- 
versary, of every non-perfect cipher with which it is composed? By construction, 
we answer this question in the affirmative. 

The constructed cipher, as presented, would not be widely viewed as prac- 
tical because it requires a variable length key which grows with the amount of 
plaintext encrypted (much like a one-time pad). On the other hand, if a cryp- 
tographically strong substitute for the key were used (such as a key schedule, 
hash, or pseudo-random function), then the strength of the security amplification 
would be no worse than the strength of the key substitute.

There are other practical implications of our result. First of all, the techniques
used here could facilitate the construction of computationally efficient primitives 
(such as dynamic S-boxes) with provably strong security properties. More gen- 
erally, if we are to understand, in more than purely heuristic terms, the security 
convergence of modern iterated cryptosystems, then our result establishes new 
limits on what can be accomplished in polynomial-time. Our construction might 
be modified and compromised to obtain faster ciphers with complementary se- 
curity results. 

2 Preliminaries 

A basic familiarity with random variables and probability spaces [11] is assumed. 
Some group theory [19] is also assumed, but in the next subsection, we shall 
review some important terminology about permutation groups [7]. 

2.1 Permutation Groups 

Let $X$ be any set. The collection of all invertible functions on $X$ forms the
symmetric group $\mathfrak{S}_X$. Any subgroup $G \le \mathfrak{S}_X$ is called a
permutation group, and we also say that $G$ acts on $X$ and that $X$ is a $G$-set.
The subgroup of $G$ which fixes a point $x \in X$ is called the (point) stabilizer
of $x$, and is given by $\mathrm{Stab}_G(x) = \{h \in G \mid hx = x\}$.

When studying $n$-bit block ciphers, the finite set $\mathcal{M} = \{0,1\}^n$ of all
$n$-bit binary strings (or equivalently the integers $\{0, 1, \ldots, 2^n - 1\}$) is
the most natural $G$-set for some permutation group
$G \le \mathfrak{S}_{\mathcal{M}}$. But additionally for this paper, we will often
consider two other actions of $G$ on related sets. By $\mathcal{M}^{(\ell)}$ we mean
the set of tuples of size $\ell$ with distinct elements in $\mathcal{M}$, $G$ acting
elementwise. If $p = (p_1, \ldots, p_\ell) \in \mathcal{M}^{(\ell)}$, the point
stabilizer $\mathrm{Stab}_G(p)$ is sometimes written
$\mathrm{Stab}_G(p_1, \ldots, p_\ell)$. By $\mathcal{M}^{\{m\}}$ we mean the set of
subsets of $\mathcal{M}$ of size $m$, where $g \in G$ acts on
$S \in \mathcal{M}^{\{m\}}$ by taking $S \mapsto gS$. The point stabilizer of
$S \in \mathcal{M}^{\{m\}}$ is sometimes written $\mathrm{Stab}_G\{S\}$.

2.2 Shannon's Model and Product Ciphers

Following Shannon [20], we model an $n$-bit block cipher as a
$\mathfrak{S}_{\mathcal{M}}$-valued random variable. If a cipher $X$ only takes
values in a subgroup $G \le \mathfrak{S}_{\mathcal{M}}$, then $X$ may be called a
$G$-cipher. We may model a stream cipher in the same spirit (cf. [15]).
Let $\{0,1\}^*$ denote the (infinite) set of finite binary strings, and let
$H \le \mathfrak{S}_{\{0,1\}^*}$ be the subgroup of length-preserving permutations.
We shall call an $H$-valued random variable a stream cipher¹. By a cipher we mean
either a block cipher or a stream cipher.

¹ In practice, a stream cipher will typically also have consistent block prefix
action, i.e. for some integer $n$, it will be confined to permutations $h \in H$
such that when $|u| = |u'| \in n\mathbb{Z}$, $h(uw) = u'w'$ implies that for all
$v$ of length $|w|$, $h(uv) = u'v'$ for

Given two independent ciphers $X$ and $Y$ acting on the same message space,
the cipher $XY$ is called a product cipher, $Y$ is called its first component and
$X$ is called its second component. The distribution of the product of two block
ciphers, both taking values in a group $G$, is given by the convolution,

$$ (x * y)(g) = \sum_{h \in G} x(gh^{-1})\, y(h), \tag{1} $$

where $x(g) = P[X = g]$ and $y(g) = P[Y = g]$. This representation of a product
cipher will prove useful in the sequel.
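
To make the convolution concrete, the following is a minimal sketch that evaluates it on the symmetric group of three symbols. The helper names (`compose`, `inverse`, `product_distribution`) and the two example distributions are illustrative assumptions, not part of the paper.

```python
from fractions import Fraction
from itertools import permutations


def compose(g, h):
    """Return the permutation g∘h (apply h first, then g) as a tuple."""
    return tuple(g[h[i]] for i in range(len(h)))


def inverse(g):
    """Return the inverse permutation of g."""
    inv = [0] * len(g)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)


def product_distribution(x, y, group):
    """Distribution of XY for independent X ~ x and Y ~ y:
    (x * y)(g) = sum over h in the group of x(g h^-1) * y(h)."""
    return {
        g: sum(x.get(compose(g, inverse(h)), 0) * y.get(h, 0) for h in group)
        for g in group
    }


S3 = list(permutations(range(3)))
identity = (0, 1, 2)
swap01 = (1, 0, 2)
cycle = (1, 2, 0)

# Hypothetical example ciphers: X is uniform on {identity, swap01},
# Y is uniform on {identity, cycle}.
x = {identity: Fraction(1, 2), swap01: Fraction(1, 2)}
y = {identity: Fraction(1, 2), cycle: Fraction(1, 2)}
xy = product_distribution(x, y, S3)

for g, prob in sorted(xy.items()):
    print(g, prob)
```

In this toy case each component is supported on two permutations while the product is supported on four, a small instance of the spreading of probability mass that the rest of the paper exploits.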

The cipher $U$ which is uniformly distributed on $\mathfrak{S}_{\mathcal{M}}$ is
called the perfect cipher. For any subgroup $G \le \mathfrak{S}_{\mathcal{M}}$ the
$G$-cipher $U_G$ which is uniformly distributed on $G$ is called the uniform
$G$-cipher. Given an infinite sequence of independent and uniformly random bits,
$z_0, z_1, \ldots$, we may form a simple stream cipher, called the one-time pad,
by mapping plaintext word $m$ into $z_{|m|} \oplus m$, where $z_{|m|}$ is the word
$z_0 \cdots z_{|m|-1}$.

2.3 The Computational Model 

Shannon’s model is a purely probabilistic one; it says very little about how a 
computer might transform plaintext into ciphertext and back. For a cipher X 
to be practical, there should be effective procedures for encryption (computing 
the action of $X$ on plaintext) and decryption (computing the action of $X^{-1}$ on
ciphertext).

One natural choice for the computational model is the standard Turing ma- 
chine model [9]. Informally, we have an encryption algorithm Enc, which has 
as input arguments the plaintext m and the random key k, and which outputs 
ciphertext c. The corresponding decryption algorithm Dec is similarly defined. 
Formally in this model, we require a pair of deterministic Turing machines E 
and D, such that (under suitable encoding) m = D(k,E(k,m)), for all m and 
k. Notice that under this model, all randomness enters as an argument to the 
encryption and decryption algorithms, or equivalently as input data on the Tur- 
ing machine tapes. Our view is that this model of computation is unnecessarily 
restrictive, because it fails to capture the simple idea that some ciphers (like 
the one-time pad) are “computationally efficient” even though they may require 
impractical amounts of key material to encrypt every possible plaintext. 

Alternatively, we consider encryption and decryption algorithms which access
key material as an auxiliary subroutine call. Formally, such a subroutine call is
idealized by an oracle function $f : \{0,1\}^* \to \{0,1\}$, and we are thus invoking
the computational model of an oracle Turing machine (OTM) [9]. An OTM is a
deterministic Turing machine augmented by an oracle tape and additional logic
so that at any time, the oracle tape with input $a$ written on it can, in one
step of computation, be transformed to have $f(a)$ written on it. An OTM $M$
with specific oracle function $f$ will be denoted by $M^f$, and its time complexity
is computed in the usual way (with oracle evaluation counting as one step). We
may model uncertainty about the oracle function by treating it as an instance
of a random oracle function $F : \{0,1\}^* \to \{0,1\}$.
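
As a rough illustration of key material accessed through a subroutine call, the sketch below models a random oracle function that is sampled lazily and memoized, and uses it to drive a one-time pad. The class name `LazyRandomOracle`, the function `otp_encrypt`, and the choice to query the oracle at the binary encoding of the bit index are assumptions made for this example only.

```python
import secrets


class LazyRandomOracle:
    """A random function F: {0,1}* -> {0,1}, sampled on first use and memoized.

    Repeated queries return the same bit, so two algorithms sharing the
    oracle (for instance an encryptor and a decryptor) agree on key bits.
    """

    def __init__(self):
        self._table = {}

    def __call__(self, query: str) -> int:
        if any(ch not in "01" for ch in query):
            raise ValueError("query must be a binary string")
        if query not in self._table:
            self._table[query] = secrets.randbits(1)
        return self._table[query]


def otp_encrypt(oracle, message: str) -> str:
    """One-time pad: XOR bit i of the message with the oracle bit for index i.

    One oracle call per message bit, so the running time is linear in the
    message length even though the oracle itself is an unbounded object.
    """
    return "".join(
        str(int(bit) ^ oracle(format(i, "b"))) for i, bit in enumerate(message)
    )


F = LazyRandomOracle()
ciphertext = otp_encrypt(F, "10110")
recovered = otp_encrypt(F, ciphertext)  # XOR with the same pad undoes it
```

The point of the sketch is only that the amount of key material actually touched grows with the work done, which is the sense in which the one-time pad counts as efficient in this model.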




The next two definitions capture our intuitive notion of efficient
encryption/decryption for block and stream ciphers, respectively.

Definition 1 (Efficient Block Ciphers). An ensemble of block ciphers $\{X_n\}$,
$n \in \mathbb{N}$, will be called computable in polynomial-time if there exists a
random oracle function $F$ and a pair of polynomial-time OTM's, $E$ and $D$, such
that for each $n \in \mathbb{N}$: (i) for each $p \in \{0,1\}^n$, $p = D^F(E^F(p))$,
and (ii) the distribution of $E^F$, restricted to strings of length $n$, is
identical to that of $X_n$, and (iii) the distribution of $D^F$, restricted to
strings of length $n$, is identical to that of $X_n^{-1}$.

By a mild but common abuse of notation, a block cipher X acting on {0, 1}" 
will be called computable in polynomial-time if it is one of an ensemble of such 
ciphers, and any important properties hold for each representative. 

Definition 2 (Efficient Stream Ciphers). A stream cipher $X$ will be called
computable in polynomial-time if there exists a random oracle function
$F$ and a pair of polynomial-time OTM's, $E$ and $D$, such that: (i) for each
$p \in \{0,1\}^*$, $p = D^F(E^F(p))$, and (ii) the distribution of $E^F$ is
identical to that of $X$, and (iii) the distribution of $D^F$ is identical to that
of $X^{-1}$.

Note that by Definitions 1 and 2, both the one-time pad and the Luby- 
Rackoff construction [15] are efficient. In fact, each is computable in linear time. 
Notice also that being computable in polynomial-time does not preclude that 
exponentially many bits may be necessary to completely describe the cipher’s 
action on the entire message space. For example, each round of the Luby-Rackoff
construction (a Feistel cipher with a perfectly random function acting on
half-words) takes on one of

$$ \left(2^{n/2}\right)^{2^{n/2}} = 2^{n 2^{n/2 - 1}} $$

distinct permutations of an $n$-bit message space. Thus, for the common 3-round
version of the construction, there must be $\Omega(n 2^{n/2})$ bits to entirely
describe it.

However, neither the one-time pad nor the Luby-Rackoff construction meets
our objective. The one-time pad is not a block cipher. Furthermore, every
permutation of the Luby-Rackoff construction is even and hence is confined to a
proper subgroup (the alternating group,
$\mathfrak{A}_{\mathcal{M}} < \mathfrak{S}_{\mathcal{M}}$), and we shall see from
Lemma 1 below that it cannot be a universal security amplifier.
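
The two claims above (the number of permutations available to one Feistel round, and the evenness of those permutations) can be checked exhaustively in a very small case. The sketch below assumes a block size of n = 4, so that round functions act on 2-bit half-words; the names `feistel_round` and `is_even` are illustrative, and the check says nothing by itself about larger n.

```python
from itertools import product

HALF = 2                       # half-word size in bits (assumed), so n = 4
MASK = (1 << HALF) - 1


def feistel_round(f):
    """Permutation of {0,...,2^n - 1} given by (L, R) -> (R, L xor f(R)).

    f is a tuple listing the round function's value on each half-word.
    """
    perm = []
    for block in range(1 << (2 * HALF)):
        left, right = block >> HALF, block & MASK
        perm.append((right << HALF) | (left ^ f[right]))
    return perm


def is_even(perm):
    """A permutation is even iff (length - number of cycles) is even."""
    seen = [False] * len(perm)
    cycles = 0
    for start in range(len(perm)):
        if not seen[start]:
            cycles += 1
            j = start
            while not seen[j]:
                seen[j] = True
                j = perm[j]
    return (len(perm) - cycles) % 2 == 0


# Every function from 2-bit words to 2-bit words: 4^4 = 256 of them.
round_functions = list(product(range(1 << HALF), repeat=1 << HALF))
rounds = [feistel_round(f) for f in round_functions]

all_even = all(is_even(p) for p in rounds)
distinct = len({tuple(p) for p in rounds})
print(all_even, distinct)
```

For n = 4 the count of distinct round permutations agrees with the formula $(2^{n/2})^{2^{n/2}} = 4^4$, and every one of them is even, so any product of such rounds stays inside the alternating group.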

2.4 Optimal Chosen Plaintext Attacks 

We now introduce the measure of security in terms of which strict security
inequalities will be derived. Informally, it is just the average cost of the
optimal (non-adaptive) chosen plaintext attack for an adversary in possession of
an oracle which will answer the question, “is $X = g$?”. There are two stages to
the optimal strategy. First the adversary discards all permutations which are
inconsistent with the acquired plaintext-ciphertext pairs. Then among the
remaining permutations, he queries the oracle for the exact permutation in order
of non-increasing probability. The adversary will obviously choose the plaintexts
such that the average cost of this strategy is minimized. The difficulty of this
attack is a direct and meaningful measure of the cipher’s security.

To formally quantify this attack against a $G$-cipher $X$,
$G \le \mathfrak{S}_{\mathcal{M}}$, let us assume that the adversary has collected
$\ell$ plaintexts and their corresponding ciphertexts into tuples
$p, c \in \mathcal{M}^{(\ell)}$, respectively. The ciphertext tuple $c$ is an
instance of the random variable $C^p = Xp$, whose uncertainty is due exclusively
to uncertainty about $X$. Now for any random variable $Z$ the average cost of
guessing its value is called the guesswork² of $Z$ and is given by

$$ W(Z) = \sum_{i=1}^{m} i\, p[i], \tag{2} $$

where $Z$ takes on $m$ values, and where the probabilities of $Z$ have been arranged
according to $p[i] \ge p[j]$ for all $i < j$. For fixed $p$ and $c$, the conditional
guesswork $W(X \mid c, p)$ is the guesswork of $X$ as in Equation (2) after
discarding all permutations $g \in G$ such that $c \ne gp$, and then rearranging
and rescaling the probabilities accordingly. Now we must still account for the
uncertainty about $C^p$. Evidently, for a particular choice of plaintext tuple $p$,
the cost of the attack must be weighted by the a posteriori probabilities
$\omega(c \mid p) = P[C^p = c \mid p]$, yielding

$$ W(X \mid C^p, p) = \sum_{c \in \mathcal{M}^{(\ell)}} W(X \mid c, p)\, \omega(c \mid p). \tag{3} $$

The minimum value of $W(X \mid C^p, p)$ is the optimal chosen plaintext attack work
factor, which will be denoted

$$ \theta_\ell(X) = \min_{p \in \mathcal{M}^{(\ell)}} W(X \mid C^p, p). \tag{4} $$

For continuity we take $\theta_0(X)$ to be $W(X)$.
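
The definitions of guesswork and of the optimal chosen plaintext work factor can be evaluated by brute force on very small message spaces. The following sketch does so for a cipher given as a dictionary from permutations (tuples) to probabilities; the function names `guesswork`, `conditional_work` and `theta` are assumptions of this example, and exhaustive search over plaintext tuples is only feasible for toy sizes.

```python
from fractions import Fraction
from itertools import permutations


def guesswork(probs):
    """W(Z) = sum_i i * p[i], with probabilities in non-increasing order."""
    ordered = sorted((p for p in probs if p > 0), reverse=True)
    return sum(i * p for i, p in enumerate(ordered, start=1))


def conditional_work(dist, plaintexts):
    """W(X | C^p, p): conditional guesswork averaged over ciphertext tuples."""
    by_ciphertext = {}
    for g, prob in dist.items():
        c = tuple(g[m] for m in plaintexts)      # the ciphertext tuple c = g p
        by_ciphertext.setdefault(c, []).append(prob)
    total = 0
    for probs in by_ciphertext.values():
        weight = sum(probs)                      # omega(c | p)
        if weight > 0:
            # Rescale the surviving probabilities, then take their guesswork.
            total += weight * guesswork([q / weight for q in probs])
    return total


def theta(dist, ell, size):
    """Minimum of W(X | C^p, p) over all tuples of ell distinct plaintexts."""
    if ell == 0:
        return guesswork(dist.values())
    return min(conditional_work(dist, p) for p in permutations(range(size), ell))


# The perfect cipher on a 3-element message space (uniform on all 6 permutations).
U = {g: Fraction(1, 6) for g in permutations(range(3))}
print(theta(U, 0, 3), theta(U, 1, 3), theta(U, 2, 3))
```

For the uniform cipher the surviving candidates after observing a plaintext tuple form a coset of its stabilizer, so the values are (k + 1)/2 with k = 6, 2 and 1 respectively.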

3 The Main Result 

3.1 The Existence Theorem 

We shall prove by construction the following theorem. 

Theorem 1. There is a cipher $X$, computable in polynomial-time, such that for
each $0 \le \ell \le 2^n$ and every independent cipher $Y$,
$\theta_\ell(XY) \ge \theta_\ell(Y)$. Furthermore, equality holds iff
$\theta_\ell(Y) = \theta_\ell(U)$.

It is easily seen (see e.g. [17]) that no non-perfect cipher $Y$ can have
$\theta_\ell(Y) = \theta_\ell(U)$ for all $\ell$. Thus this theorem tells us, in a
very meaningful way, that every non-perfect cipher is brought closer to the
perfect cipher by left multiplication by $X$.

The proof of Theorem 1 relies on three lemmas which treat different aspects
of the problem. To express these lemmas succinctly, we introduce some additional
terminology. First, the support of a $G$-cipher (or indeed any random variable)
may be defined as $\mathrm{supp}(X) = \{g \in G \mid P[X = g] \ne 0\}$. Second, it
is useful to denote the size of the smallest $\ell$-message stabilizer of a group
by $M_G(\ell) = \min_{p \in \mathcal{M}^{(\ell)}} |\mathrm{Stab}_G(p)|$. It is
easily seen that $\theta_\ell(U_G) = \frac{1}{2}[1 + M_G(\ell)]$.

² Guesswork has sometimes been called guessing entropy, cf. [18] and [3].

The first lemma from [17] treats the case $\ell = 0$ but is also useful in
establishing the other results.

Lemma 1. Given $G \le \mathfrak{S}_{\mathcal{M}}$, let $X$ be a $G$-cipher. Every
independent non-uniform $G$-cipher $Y$ satisfies $W(XY) > W(Y)$, iff for each
$g \in G$ and each subgroup $H < G$, $\mathrm{supp}(X) \not\subseteq gH$.

The next lemma provides sufficient conditions for nearly universal amplification
($\ell > 0$) for ciphers in any permutation group.

Lemma 2. For a permutation group $G \le \mathfrak{S}_{\mathcal{M}}$, let $X$ be a
$G$-cipher such that $\mathrm{supp}(X) = G$. Then for each $1 \le \ell \le 2^n$ and
every independent $G$-cipher $Y$, $\theta_\ell(XY) \ge \theta_\ell(Y)$. Furthermore,
equality holds iff $\theta_\ell(Y) = \theta_\ell(U_G)$.

The final lemma asserts the existence of a cipher suitable to translate Lemmas 
1 and 2 into Theorem 1. 

Lemma 3. There is a cipher $X$, computable in polynomial-time, with
$\mathrm{supp}(X) = \mathfrak{S}_{\mathcal{M}}$.

Assuming the validity of the above lemmas, the proof of Theorem 1 is immediate. 

The proof of Lemma 2 is rather involved and is sketched in Sect. 4.4. Most 
of the rest of this paper is devoted to the construction of X and the proof of 
Lemma 3. Before diving into the precise details in Sect. 4, let us first take a 
slightly more informal look at the ideas underlying this construction. 

3.2 An Intuitive Glimpse at the Construction 

The symmetric group on the message space is truly enormous. Its size is
approximated by

$$ \log\log(2^n!) \approx n + \log(n) = O(n). $$

Because it takes two logarithms to bring $2^n!$ down to the polynomial $n$, our
construction will exhibit two distinct sources of algorithmic efficiency:

1. Recursion: The cipher $X$ will be recursively defined as the product of simpler
ciphers. More precisely, the encryption algorithm Enc will itself be recursive
but will also call another recursive algorithm invSort. The decryption
algorithm Dec will be similarly defined. The time complexity and recursion depth
of each algorithm will be a polynomial in $n$.

2. Oblivious Action³: The cipher $X$ will be representable as the product of
a large number of random powers of transpositions (i.e. permutations of
message blocks two at a time). Then Enc and Dec, the defining algorithms of
$X$, will make use of only polynomially many transpositions for every block
encrypted.
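
The doubly logarithmic size estimate for the symmetric group on $2^n$ symbols is easy to check numerically. The sketch below assumes base-2 logarithms and uses `math.lgamma` so that the factorial is never formed explicitly; the function name `loglog_factorial_of_2n` is an illustrative choice.

```python
import math


def loglog_factorial_of_2n(n: int) -> float:
    """Return log2(log2((2^n)!)) without computing the factorial itself."""
    ln_factorial = math.lgamma(2 ** n + 1)       # natural log of (2^n)!
    return math.log2(ln_factorial / math.log(2))


rows = [(n, loglog_factorial_of_2n(n), n + math.log2(n)) for n in (8, 16, 32, 64)]
for n, exact, approx in rows:
    print(f"n={n:3d}  loglog={exact:8.4f}  n+log2(n)={approx:8.4f}")
```

The two columns differ by a fraction of a bit over this range, and the gap narrows as $n$ grows.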

³ We borrow this term from [16], where it is used in the same context.

Let $G \le \mathfrak{S}_{\mathcal{M}}$ be any permutation group. There are many
ways to construct a product cipher $PQ$ which achieves every permutation in $G$,
even though both $P$ and $Q$ are sparse on $G$. Indeed for any subgroup $H < G$, we
may take $Q$ which achieves every permutation in $H$ and $P$ which achieves one
permutation in every left coset of $H$ in $G$. It is easy to see that $PQ$ achieves
every permutation in $G$. For many large groups, it is possible to find subgroups
satisfying $|G| \gg |H|$ and $|G| \gg [G : H]$. Formally, we have an amplification
of support: $|\mathrm{supp}(PQ)| \gg |\mathrm{supp}(P)|$, and
$|\mathrm{supp}(PQ)| \gg |\mathrm{supp}(Q)|$. Thus by exploiting the
algebraic structure of the group, we may construct a densely distributed cipher
as a product of very sparsely distributed ciphers.
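
A small instance of this amplification of support can be enumerated directly. The sketch below assumes $G$ is the symmetric group on four points and $H$ is the stabilizer of the last point; the variable names are illustrative, and only supports (not probabilities) are tracked.

```python
from itertools import permutations


def compose(g, h):
    """Return g∘h: apply h first, then g."""
    return tuple(g[h[i]] for i in range(len(h)))


N = 4
G = set(permutations(range(N)))                   # symmetric group, 24 elements
H = {g for g in G if g[N - 1] == N - 1}           # stabilizer of the last point


def transposition(i, j):
    """The permutation swapping i and j (the identity when i == j)."""
    t = list(range(N))
    t[i], t[j] = t[j], t[i]
    return tuple(t)


# A left coset gH is determined by the image g(N-1), so the N permutations
# sending N-1 to 0, 1, ..., N-1 give one representative per coset.
P_support = {transposition(i, N - 1) for i in range(N)}
Q_support = H
PQ_support = {compose(p, q) for p in P_support for q in Q_support}

print(len(P_support), len(Q_support), len(PQ_support))
```

Here supports of size 4 and 6 multiply out to all 24 permutations; the construction in the paper applies the same idea along a chain of subgroups rather than a single one.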

Let’s try to carry this idea even further. Consider a chain of subgroups of $G$

$$ \{1\} = H_0 < H_1 < \cdots < H_m = G, $$

and for each $i$, an $H_i$-cipher $P_i$ which contains one permutation in every left
coset of $H_{i-1}$ in $H_i$. Then by simple induction, the product cipher
$P_m \cdots P_2 P_1$ would have complete support on $G$. For example in the
symmetric group on $2^n$ symbols, consider the subgroups
$H_i = \mathrm{Stab}(1, \ldots, 2^n - i)$, $0 \le i \le 2^n$. On the
one hand, this choice of subgroups is promising because the number of cosets in
$\mathfrak{S}_{2^n}$ of the largest proper subgroup is the polynomial $n$.
Unfortunately however, there are $2^n$ subgroups in this chain, and so the number
of terms in the product $P_m \cdots P_2 P_1$ grows exponentially with $n$. If we
are to employ this technique, it may be inconvenient to use a chain of subgroups
which fix collections of words in $\mathcal{M}$ - either as tuples or as sets -
because any hierarchy of such collections would typically be as large as
$\mathcal{M}$ itself.

It thus makes more sense to define subgroups which fix some feature of the
words in $\mathcal{M}$. To that end define $K_i$ to be the subgroup consisting of
the permutations of $\mathfrak{S}_{\mathcal{M}}$ which preserve the first $n - i$
bits of each message block. We shall call $K_i$ the $(n - i)$-bit prefix stabilizer
subgroup of $\mathfrak{S}_{\mathcal{M}}$; as $i$ ranges from 0 to $n$ these form
the chain of subgroups

$$ \{1\} = K_0 < K_1 < \cdots < K_n = \mathfrak{S}_{\mathcal{M}}. \tag{5} $$

We will construct, for each $1 \le i \le n$, a $K_i$-cipher $P_i$ which contains one
permutation in every left coset of $K_{i-1}$ in $K_i$. Then the cipher of Lemma 3
will be defined as

$$ X = P_n \cdots P_2 P_1. \tag{6} $$

Table 1. A randomly chosen arrangement of {0, 1, . . . , 7} is sorted with respect to the
most significant bit after the application of only 3 rounds of disjoint transpositions.
The first two columns indicate the initial arrangement (position, value). The next three
columns give, for each round, the transposition affecting the value at that position and
the subsequent arrangement; () denotes that the position is left unchanged.

                         round 0          round 1          round 2
  position   value       transp.  arr.    transp.  arr.    transp.  arr.
     7       2 = 010     (67)     100     ()       100     ()       100
     6       4 = 100     (67)     010     (46)     111     ()       111
     5       1 = 001     ()       001     ()       001     (15)     101
     4       7 = 111     ()       111     (46)     010     (04)     110
     3       0 = 000     (23)     101     (13)     011     ()       011
     2       5 = 101     (23)     000     ()       000     ()       000
     1       3 = 011     ()       011     (13)     101     (15)     001
     0       6 = 110     ()       110     ()       110     (04)     010

But let us compute the minimal support required of $P_n$. That is to say, let us
count the number of left cosets of $K_{n-1}$ in $\mathfrak{S}_{\mathcal{M}}$. Since
$K_{n-1}$ permutes all but the most significant bit of words in $\mathcal{M}$, the
left cosets of $K_{n-1}$ are characterized by the rearrangements of $\mathcal{M}$
with distinct patterns of the most significant bit. There are precisely

$$ [\mathfrak{S}_{\mathcal{M}} : K_{n-1}] = \binom{2^n}{2^{n-1}} $$

of these rearrangements. Observe that while we have reduced the number of
permutations by a large number (by $(2^{n-1}!)^2$ in fact), on a doubly logarithmic
scale we still have

$$ \log\log\binom{2^n}{2^{n-1}} \approx n + 1 = O(n). $$



It may appear that we are right back where we started, yet we have transported 
the problem onto very fertile new ground. 

The efficiency in our algorithms for $P_i$ has its heritage in the closely related
problem of card shuffling. In fact, both the security of a product cipher [17]
and the fairness of a shuffled deck of cards [2,6] are related to the uniformity of
convolutions as in (1). In their now famous analysis of riffle shuffles, Aldous and
Diaconis remarked that “the lovely new idea here is to consider shuffling as
inverse sorting.” [2, Remark (a), p. 344]. Indeed it is quite natural to consider
encryption as inverse sorting because the rearrangements of $\mathcal{M}$ which
characterize the left cosets of $K_{n-1} < \mathfrak{S}_{\mathcal{M}}$ correspond
precisely with the permutations which would be used in the first step of the
obvious recursive sorting algorithm. In the reverse order, we may achieve all
permutations of $\mathcal{M}$ by first achieving all rearrangements of the most
significant bit, and then proceeding recursively with the less significant bits.
What we claim is that sorting and inverse sorting on the most significant bit can
be done in polynomial-time using both recursion and the oblivious action of
transpositions. The rest is gravy.

Fig. 1. The structure of the cipher $X = P_3 P_2 P_1$ for $n = 3$. The rounds are
applied left to right, and each round corresponds to a random involution shown as a
vertical column of butterflies. Each butterfly in the diagram represents a random
transposition of the form $(k, k \oplus 2^i)^b$, where parallel lines indicate
$b = 0$ and a crossover indicates $b = 1$.

Let us demonstrate this efficiency in a simple example with $n = 3$ and thus
$\mathcal{M} = \{0, 1, \ldots, 7\}$. We start with a random arrangement
(6, 3, 5, 0, 7, 1, 4, 2) of the elements of $\mathcal{M}$, and attempt to sort this
tuple on the most significant bit by the application of $n = 3$ rounds of
involutions (recall that every involution is a product of disjoint
transpositions). For reasons of efficiency we shall restrict ourselves to
transpositions of the form $(j, j \oplus 2^i)$, with $i$ constant for every round.
The allowable round involutions are
$(01)^{b_0}(23)^{b_1}(45)^{b_2}(67)^{b_3}$,
$(02)^{b_4}(13)^{b_5}(46)^{b_6}(57)^{b_7}$ and
$(04)^{b_8}(15)^{b_9}(26)^{b_{10}}(37)^{b_{11}}$ for rounds 0, 1 and 2,
respectively. Table 1 shows that we can indeed sort on the most significant bit of
$2^n$ integers by carefully choosing the powers $b_i$ in only $n$ rounds.
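
The worked example of Table 1 can be replayed mechanically. The sketch below applies the three rounds of transpositions listed in the table to the starting arrangement and extracts the most significant bit of each value; the function name `apply_round` and the dictionary encoding of the chosen powers are assumptions of this example.

```python
def apply_round(arrangement, shift, powers):
    """Swap positions j and j xor 2^shift whenever that pair's power is 1."""
    result = list(arrangement)
    for j in range(len(result)):
        partner = j ^ (1 << shift)
        if j < partner and powers.get((j, partner), 0) == 1:
            result[j], result[partner] = result[partner], result[j]
    return result


arrangement = [6, 3, 5, 0, 7, 1, 4, 2]        # value held at positions 0..7
chosen = [
    {(2, 3): 1, (6, 7): 1},                   # round 0 of Table 1: (23)(67)
    {(1, 3): 1, (4, 6): 1},                   # round 1 of Table 1: (13)(46)
    {(0, 4): 1, (1, 5): 1},                   # round 2 of Table 1: (04)(15)
]
for shift, powers in enumerate(chosen):
    arrangement = apply_round(arrangement, shift, powers)

msb = [value >> 2 for value in arrangement]   # most significant of 3 bits
print(arrangement, msb)
```

After the third round every value with leading bit 0 sits in positions 0 to 3 and every value with leading bit 1 sits in positions 4 to 7, which is all that sorting on the most significant bit requires.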



To overcome the limitations of having so few permutations, our strategy is as
follows: the goal at the end of round 1 is to collect integers with leading 1 into
the lowest part of the bottom half (those positions $\le 3$), and to collect
integers with leading 0 into the lowest part of the top half (those positions
$\ge 4$). Then the powers of the transpositions in the final round (round 2) are
determined by the sorting requirement. We claim that this strategy will work for
all $n$.



4 The Construction Details 

In the next two subsections we present the detailed construction of the cipher 
X of Lemma 3. In the following two subsections we prove Lemma 3 and sketch 
the proof of Lemma 2, respectively. 



4.1 Algebraic Details 

We may encrypt by inverting the sorting procedure described in the previous
section. Formally, for any $j$, define $T^{(j)}$ to be the product of independent
and uniformly random powers of the $2^{n-1}$ distinct transpositions of the form
$(k, k \oplus 2^j)$, with $0 \le k \le 2^n - 1$. Then let

$$ P_i = T^{(0)} T^{(1)} \cdots T^{(i-1)}, $$

and as before $X = P_n \cdots P_2 P_1$. Each random involution corresponds to a
“round” as shown in Fig. 1. Note that while there is repetition (e.g. the factors
$T^{(0)}$ occurring in $P_1$ and in $P_2$ are i.i.d. random variables), $X$ is not a
traditional iterated cryptosystem because the specific sequence of rounds is
carefully chosen.

4.2 Algorithm Details

It is clear that we may recursively effect the actions of $X$ and $X^{-1}$ on any
block, if we can carry out the rounds in the correct order and in such a way
that the powers of all relevant transpositions are independent and equiprobable.
Moreover, if we encounter the same butterfly in two different executions,
we must be able to reproduce the same random power of the corresponding
transposition. This is easily accomplished if we consider the random bits to
be indexed by $\mathcal{M} \times \mathbb{Z}$. The resulting function
$f : \mathcal{M} \times \mathbb{Z} \to \{0,1\}$ is easily transformed into a random
oracle function $F : \{0,1\}^* \to \{0,1\}$ appropriate for Definition 1. We employ
the convention that the power of any transposition $(k, k \oplus 2^i)$ is
$f(m, r)$, where $m = \min\{k, k \oplus 2^i\}$ and $r$ is the round. In other
words, $f$ is applied to the lower left hand corner of every butterfly in Fig. 1.

The next algorithm implements encryption. To encrypt a single plaintext
block, the computational complexity will be $n(n + 1)/2$ or $O(n^2)$. For a block
size of $n = 128$, this yields about 8,256 operations.

Algorithm 1. Defines recursive encryption functions Enc and invSort. The
action of $X = P_n \cdots P_2 P_1$ on $p \in \mathcal{M}$ is effected by
$(q, r) = \mathrm{Enc}(p, n, 1)$, such that $q = Xp$. The action of $P_i$ on
$p \in \mathcal{M}$ is effected by $(q, r) = \mathrm{invSort}(p, i - 1, *)$,
such that $q = P_i p$.





function invSort(p, j, r):
    q = p ⊕ 2^j.
    if p < q then
        b = f(p, r).
    else
        b = f(q, r).
    endif
    if b = 0 then
        q = p.
    endif
    if j > 0 then
        return invSort(q, j - 1, r + 1).
    else
        return (q, r + 1).
    endif

function Enc(p, i, r):
    if i > 1 then
        (p, r) = Enc(p, i - 1, r).
    endif
    return invSort(p, i - 1, r).
endif 



The decryption algorithm is easily obtained by performing the transpositions
in the reverse order. The necessary modifications are immediate, and we shall
call the “reverse” of inverse-sorting fwdSort.

Algorithm 2. Defines recursive decryption functions Dec and fwdSort. The
action of $X^{-1} = P_1^{-1} P_2^{-1} \cdots P_n^{-1}$ on $p \in \mathcal{M}$ is
effected by $(q, r) = \mathrm{Dec}(p, n, \frac{1}{2}n(n + 1))$, such that
$q = X^{-1}p$. The action of $P_i^{-1}$ on $p \in \mathcal{M}$ is effected by
$(q, r) = \mathrm{fwdSort}(p, i - 1, *)$, such that $q = P_i^{-1}p$.

function fwdSort(p, j, r):
    if j > 0 then
        (p, r) = fwdSort(p, j - 1, r).
    endif
    q = p ⊕ 2^j.
    if p < q then
        b = f(p, r).
    else
        b = f(q, r).
    endif
    if b = 0 then
        return (p, r - 1).
    else
        return (q, r - 1).
    endif

function Dec(p, i, r):
    (q, r) = fwdSort(p, i - 1, r).
    if i > 1 then
        return Dec(q, i - 1, r).
    else
        return (q, r).
    endif



Remark 1. Notice how the round information is explicitly carried by the
input-output argument $r$ through the entire recursion process. During the
execution of Enc, it is incremented, while during the execution of Dec it is
decremented. This is necessary because encryption and decryption must agree on
the random bits $f(p, r)$ which determine the appropriate powers of the various
transpositions involved. □
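The two algorithms can be exercised on a small block size. The following Python sketch mirrors the recursion of Enc, invSort, Dec and fwdSort as reconstructed above. The function `make_f`, which derives the bit $f(m, r)$ from a SHA-256 hash of a key, is purely an illustrative stand-in for the random bits indexed by $\mathcal{M} \times \mathbb{Z}$ and is not part of the paper.

```python
import hashlib


def make_f(key: bytes):
    """Hypothetical stand-in for the random bits f(m, r); not from the paper."""
    def f(m: int, r: int) -> int:
        data = key + m.to_bytes(16, "big") + r.to_bytes(4, "big")
        return hashlib.sha256(data).digest()[0] & 1
    return f


def inv_sort(f, p, j, r):
    # One butterfly on bit j, then recurse down to bit 0; r counts rounds upward.
    q = p ^ (1 << j)
    b = f(min(p, q), r)      # bit attached to the lower element of the pair
    if b == 0:
        q = p
    if j > 0:
        return inv_sort(f, q, j - 1, r + 1)
    return q, r + 1


def enc(f, p, i, r):
    # Applies P_1 first, then P_2, ..., then P_i.
    if i > 1:
        p, r = enc(f, p, i - 1, r)
    return inv_sort(f, p, i - 1, r)


def fwd_sort(f, p, j, r):
    # Undoes inv_sort: bit 0 first, bit j last; r counts rounds downward.
    if j > 0:
        p, r = fwd_sort(f, p, j - 1, r)
    q = p ^ (1 << j)
    b = f(min(p, q), r)
    return (p, r - 1) if b == 0 else (q, r - 1)


def dec(f, p, i, r):
    q, r = fwd_sort(f, p, i - 1, r)
    if i > 1:
        return dec(f, q, i - 1, r)
    return q, r


n = 5
f = make_f(b"demo key")
last_round = n * (n + 1) // 2
ciphertexts = [enc(f, p, n, 1)[0] for p in range(1 << n)]
plaintexts = [dec(f, c, n, last_round)[0] for c in ciphertexts]
```

Because the bit for a butterfly depends only on the smaller element of the pair and on the round, decryption meets the same bits in reverse order, which is the point made in Remark 1.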

4.3 The Proof of Lemma 3 

To prove Lemma 3 we must first develop some terminology and prove some
preliminary results. Recall that the integers in $\mathcal{M}$ will have a dual
role as $n$-bit strings. When treating prefixes and other substrings it is useful
to have a padding function $\pi_i : \mathbb{Z} \to \{0,1\}^i$ taking $j$ to the
binary representation of $j \bmod 2^i$ padded up to $i$ bits. Also define a prefix
truncation function $\tau_i : \{0,1\}^n \to \{0,1\}^i$ taking a binary word $w$ to
its first $i$ bits (the most significant $i$ bits).

It is natural for us to recursively partition $\mathcal{M}$ into disjoint
subsets which share the same prefix. For example, let
$S_0 = \{i \in \mathcal{M} \mid \tau_1(i) = 0\}$ and
$S_1 = \{i \in \mathcal{M} \mid \tau_1(i) = 1\}$, so that $\mathcal{M}$ is the
disjoint union $S_0 \cup S_1$. More generally, let
$S_{\pi_i(j)} = \{k \in \mathcal{M} \mid \tau_i(k) = \pi_i(j)\}$ with
$0 \le j \le 2^i - 1$, and again we partition $\mathcal{M}$ into disjoint subsets

$$\mathcal{M} = \bigcup_{j=0}^{2^i-1} S_{\pi_i(j)}.$$

The prefix stabilizers are naturally expressed in terms of these subsets, for
example clearly $K_{n-1} = \mathrm{Stab}_{K_n}\{S_0\} \cap \mathrm{Stab}_{K_n}\{S_1\}$,
and more generally

$$K_{n-i} = \bigcap_{j=0}^{2^i-1} \mathrm{Stab}_{K_n}\{S_{\pi_i(j)}\}.$$

The following proposition characterizes the left cosets of $K_{n-1} < K_n$.

Proposition 1. A left coset of $K_{n-1}$ in $K_n$ is completely determined by the
image of $S_0$ under the action of any left coset representative.

Proof. First of all $K_{n-1} = \mathrm{Stab}_{K_n}\{S_0\} \cap \mathrm{Stab}_{K_n}\{S_1\}
= \mathrm{Stab}_{K_n}\{S_0\}$, because anything which fixes $S_0$ must also fix
$S_1$. Now $K_n = \mathfrak{S}_{\mathcal{M}}$ acts transitively on the set of all
subsets of $\mathcal{M}$ of half its size. By standard group action arguments
[19,7], the left cosets $\{g_i K_{n-1}\}$ are in one-to-one correspondence with
the images $\{g_i S_0\}$, in a well-defined way. □
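Proposition 1 can be checked by brute force in the smallest non-trivial case. The sketch below assumes $n = 2$, so that $\mathcal{M} = \{0,1,2,3\}$, $S_0 = \{0,1\}$ and $K_2$ is the full symmetric group on four points; the variable names are illustrative only.

```python
from collections import defaultdict
from itertools import permutations

M = range(4)                      # n = 2, so M = {0, 1, 2, 3}
S0 = frozenset({0, 1})            # elements whose most significant bit is 0

K_n = list(permutations(M))       # g is a tuple with g[x] the image of x
K_n1 = [g for g in K_n if frozenset(g[x] for x in S0) == S0]  # Stab{S0}


def compose(g, h):
    """Return the permutation x -> g(h(x))."""
    return tuple(g[h[x]] for x in M)


# Map every left coset g*K_{n-1} to the set of images g(S0) of its members.
images_per_coset = defaultdict(set)
for g in K_n:
    coset = frozenset(compose(g, k) for k in K_n1)
    images_per_coset[coset].add(frozenset(g[x] for x in S0))
```

Each of the six cosets carries exactly one image of $S_0$, and the six images are the six 2-element subsets of $\mathcal{M}$.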

We shall derive presently a similar characterization of the left cosets of
$K_{n-i-1}$ in $K_{n-i}$. First let's agree that whenever $A \subseteq B$ we will
consider $\mathfrak{S}_A$ to be a subgroup of $\mathfrak{S}_B$. Recall [19] that
if a group $G$ factors into a product $G = HK$ of normal subgroups $H$ and $K$,
with $H \cap K = \{1\}$, then $G$ is a direct product of $H$ and $K$ (it is
literally isomorphic to the Cartesian product with the obvious group law).
Clearly whenever $B$ is a disjoint union of $A_1$ and $A_2$, $\mathfrak{S}_B$
contains the direct product $\mathfrak{S}_{A_1}\mathfrak{S}_{A_2}$. Visibly,
$K_{n-1} = \mathfrak{S}_{S_0}\mathfrak{S}_{S_1}$, and if we write
$\mathfrak{S}_{\pi_i(j)} = \mathfrak{S}_{S_{\pi_i(j)}}$ we also have that
$K_{n-i}$ is the direct product

$$K_{n-i} = \prod_{j=0}^{2^i-1} \mathfrak{S}_{\pi_i(j)}.$$



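Proposition 2. A left coset of $K_{n-i-1}$ in $K_{n-i}$ is completely determined
by the images of $S_{\pi_i(j)0}$, $0 \le j \le 2^i - 1$, under the action of any
left coset representative.

Proof. Because $K_{n-i}$ is the direct product given above, a left coset
$gK_{n-i-1}$ factors into a product of left cosets

$$\prod_{j=0}^{2^i-1} g_j \left( \mathrm{Stab}_{\mathfrak{S}_{\pi_i(j)}}\{S_{\pi_i(j)0}\}
\cap \mathrm{Stab}_{\mathfrak{S}_{\pi_i(j)}}\{S_{\pi_i(j)1}\} \right).$$

However, we again have

$$\mathrm{Stab}_{\mathfrak{S}_{\pi_i(j)}}\{S_{\pi_i(j)0}\} \cap
\mathrm{Stab}_{\mathfrak{S}_{\pi_i(j)}}\{S_{\pi_i(j)1}\}
= \mathrm{Stab}_{\mathfrak{S}_{\pi_i(j)}}\{S_{\pi_i(j)0}\}.$$

Finally, $2^i$ invocations of Prop. 1 yield the desired result. □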

With this machinery in place, we may now prove Lemma 3. 

Proof (of Lemma 3). Recall that in order to facilitate the induction argument of
Sect. 3.2, thereby establishing that $\mathrm{supp}(X) = \mathfrak{S}_{\mathcal{M}}$,
we must show that (for each $i$) $\mathrm{supp}(P_{n-i})$ contains a representative
of each left coset of $K_{n-i-1}$ in $K_{n-i}$.

What we'll actually show, by an inner induction argument, is that for every
subset $S \subseteq \mathcal{M}$ contiguous on each $S_{\pi_i(j)}$
($0 \le j \le 2^i - 1$) and every possible image $T$ of $S$ under the action of
$K_{n-i}$ (i.e., every $T$ of the form $gS$ for some $g \in K_{n-i}$),
$\mathrm{supp}(P_{n-i})$ contains a permutation $g$ taking $S \mapsto T$. Since each
$S_{\pi_i(j)0}$ is trivially a contiguous subset of $S_{\pi_i(j)}$, we have the
desired result by Prop. 2. Note also that if we can take an arbitrary contiguous
set to an arbitrary image, then we can also take an arbitrary complement of a
contiguous set to an arbitrary image.

Induction Base: Clearly $K_1$ is isomorphic to the direct product of $2^{n-1}$
symmetric groups on 2 elements (cyclic groups of order 2), and thus has size
$|K_1| = 2^{2^{n-1}}$. Since $|\mathrm{supp}(P_1)| = 2^{2^{n-1}}$ also, the
induction hypothesis holds trivially.

Induction Step: Without loss of generality, we consider the case $i = 0$. By
hypothesis, $\mathrm{supp}(P_{n-1})$ contains an element of $K_{n-1}$ taking any
contiguous subset of $S_0$ to a desired image ($\subseteq S_0$, and of the same
size), while simultaneously taking any contiguous subset of $S_1$ to a desired
image (again $\subseteq S_1$, and of the same size). Choose arbitrary sets
$U \subseteq S_0$, $V \subseteq S_1$, let $T = U \cup V$, and choose any
contiguous set $S \subseteq \mathcal{M}$ of size $|T|$. Again without loss of
generality, we may assume that $|S \cap S_0| \ge |U|$ (because otherwise
$|S \cap S_1| > |V|$ and a completely symmetric argument applies). We must show
that $\mathrm{supp}(P_n)$, which is the product of $\mathrm{supp}(P_{n-1})$ with the
support of the butterfly round on the most significant bit, contains a $g$ such
that $gS = T$. Write $g = hk\sigma$, with $h \in \mathfrak{S}_{S_0}$,
$k \in \mathfrak{S}_{S_1}$ and where $\sigma$ is some product of transpositions of
the form $(j, j \oplus 2^{n-1})$. Evidently the real job of $\sigma$ is to send
elements of $S \cap S_0$ in excess of $|U|$ across the most significant bit
boundary into $S_1$, because $h, k \in \mathrm{Stab}_{K_n}\{S_0\}$ cannot do this
later on. The transpositions in the support of that round, which flip the most
significant bit, are perfect for this task. Let $\sigma$ be the product of the
transpositions $(j, j \oplus 2^{n-1})$, with $j \in J$, where $J$ consists of the
highest $|S \cap S_0| - |U|$ elements of $S \cap S_0$. We claim that
$(\sigma S) \cap S_0$ is a contiguous subset of $S_0$, and that
$(\sigma S) \cap S_1$ is either a contiguous subset or the complement of a
contiguous subset of $S_1$. Assuming that is true, then by the induction
hypothesis, we may choose $h$ taking $(\sigma S) \cap S_0 \mapsto U$ and $k$
taking $(\sigma S) \cap S_1 \mapsto V$, so that $gS = hk(\sigma S) = T$.

Two cases naturally arise. (Case 1:) If $S$ doesn't intersect with $S_1$ then
$\sigma$ takes $J$ contiguously to some image in the middle of $S_1$, and $\sigma$
leaves $S - J$ contiguously in the middle of $S_0$. (Case 2:) On the other hand,
if $S$ intersects non-trivially with $S_1$, then because $S$ is contiguous, $J$ is
precisely the highest $|S \cap S_0| - |U|$ elements of $S_0$ itself, and
furthermore $S \cap S_1$ consists of the lowest $|S \cap S_1|$ elements of $S_1$,
which are left fixed by $\sigma$. Therefore $(\sigma S) \cap S_1$ consists of the
complement of a contiguous set (those elements between $S \cap S_1$ and
$\sigma J$). But again $\sigma$ leaves $(S \cap S_0) - J$ contiguously in the
middle of $S_0$. This completes the induction step for $i = 0$.

Applying this same argument within the appropriate direct product sub- 
groups when i > 0 yields the inner induction step and thus completes the proof. 

□ 



Remark 2. The previous proof seems harrowing with 2 cases nested inside 2 
w.l.o.g.’s nested inside of 2 layers of induction. But, it is in essence just a rigorous 
form of the more intuitive sorting example given in the previous section (which 
may have seemed simpler at first glance). □ 
4.4 Sketch of the Proof of Lemma 2

In this section, we shall sketch the proof of Lemma 2 with the help of some 
preliminary results. 

The following inequality from [17] is essentially the translation into guesswork
terminology of a nice result attributed to Day [5] about majorization for sums
of real vectors (see [13] and [17]). Given any $m$ vectors
$v_1, \ldots, v_m \in \mathbb{R}^n$, $m$ doubly stochastic $n \times n$ matrices
$D_1, \ldots, D_m$ and $m$ positive real numbers $\omega_1, \ldots, \omega_m$
we have

$$W\left(\sum_{i=1}^{m} \omega_i D_i v_i\right) \ge \sum_{i=1}^{m} \omega_i W(v_i). \qquad (7)$$

Our technique for quantifying and comparing various performance values of
the optimal chosen plaintext attack typically starts with a fixed $\ell$ and a
fixed $\ell$-message $p$. We then proceed to study how the cipher's structure
affects the expression of (3). A simple but useful observation is that the
conditional guesswork $W(Y|c,p)$ is completely determined by the distribution of
$Y$ on some coset of the stabilizer $H = \mathrm{Stab}_G(p)$. Let $k = [G : H]$
and fix a set $\{g_1, \ldots, g_k\}$ of left coset representatives of $H$ in $G$.
It is useful to treat the distribution $y(g) = P[Y = g]$ as a giant vector in
$\mathbb{R}G$ (the real vector space spanned by $G$) which decomposes into a
direct sum of left coset component vectors as described below.

A mathematically succinct way to handle this decomposition of $y$, especially
when we need to study the effect of multiplication by $X$, is to exploit the fact
that the vector space $\mathbb{R}G$ is also the real group algebra generated by
$G$ and is isomorphic as a left $\mathbb{R}G$-module to the induced
representation from $H$ to $G$ by $\mathbb{R}H$ given by the tensor product of
modules^1

$$\mathbb{R}G \cong \mathbb{R}G \otimes_{\mathbb{R}H} \mathbb{R}H
= \bigoplus_{i=1}^{k} g_i \otimes \mathbb{R}H, \qquad (8)$$

^1 Our treatment of induced representations follows Jacobson's [12] and is aimed
at succinctness. Briefly, given a ring $B$, a right $B$-module $U$, and a left
$B$-module $V$, one forms the tensor product of modules $T = U \otimes_B V$ in a
way which is completely analogous to the case of vector spaces, except that now
we have $ub \otimes v = u \otimes bv$, $b \in B$. In general, $T$ is only a
$\mathbb{Z}$-module, but when $U$ is an $A$-$B$-bimodule for some ring $A$ (i.e.
$U$ is a left $A$-module as well as a right $B$-module), $T$ becomes a left
$A$-module with left multiplication defined by $a(u \otimes v) = au \otimes v$.
In this way, for any $\mathbb{R}H$-module $V$, $\mathbb{R}G \otimes_{\mathbb{R}H} V$
becomes an $\mathbb{R}G$-module called the representation induced from $H$ to $G$
by $V$.

The reader who is unfamiliar with the module approach to representations is
encouraged to begin with a more constructive definition of the induced
representation (such as in [10]) and work out the relatively inelegant details
for left multiplication by a cipher $X$.
where the isomorphism takes $g_i h \mapsto g_i \otimes h$, and thus takes

$$\sum_{g \in G} y(g)\,g = \sum_{i=1}^{k} \sum_{h \in H} y(g_i h)\, g_i h
\;\mapsto\; y = \sum_{i=1}^{k} g_i \otimes \left( \sum_{h \in H} y(g_i h)\, h \right)
= \sum_{i=1}^{k} g_i \otimes y^{(i)}. \qquad (9)$$

Note that $P[Y = g_i h] = y(g_i h) = y^{(i)}_h$, and so the vectors $y^{(i)}$
really describe the distribution of $Y$ on the cosets of $H$, even though each one
is an element of $\mathbb{R}H$. Now, it follows directly from the definition that

$$W(Y|C^{\ell}, p) = \sum_{i=1}^{k} W(y^{(i)}).$$

Now recall that the product $Z = XY$ of two independent $G$-ciphers $X$ and
$Y$ has distribution $P[XY = g] = x * y(g)$, which from (1) has the form of a
matrix multiplication. Indeed using the direct sum decomposition of the induced
representation given in (8), we shall derive the block structure of this matrix.
Using this structure we shall compare the distribution within the appropriate
cosets of $H$ for $XY$ vs. $Y$. The key is to represent $Y$ by $y$ as in (9), but
to leave $X$ as a convex sum of the permutations in $G$ weighted by
$x(g) = P[X = g]$. We aim to derive the form of $Z$ represented by
$z \in \mathbb{R}G \otimes_{\mathbb{R}H} \mathbb{R}H$, again as in (9).

Now any $g \in G$ acts by left multiplication on any
$g_j \otimes v \in \mathbb{R}G \otimes_{\mathbb{R}H} \mathbb{R}H$ according to
$g(g_j \otimes v) = g_i \otimes hv$, where $g g_j H = g_i H$, so that $h \in H$ is
uniquely determined by $g g_j = g_i h$. Thus we have that




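$$z = \sum_{i=1}^{k} g_i \otimes z^{(i)}
= \left( \sum_{g \in G} x(g)\, g \right) \left( \sum_{j=1}^{k} g_j \otimes y^{(j)} \right)
= \sum_{j=1}^{k} \left( \sum_{g \in G} x(g)\, g \right) (g_j \otimes y^{(j)}).$$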

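For any particular $i$ we may collect together contributions to the direct summand
$g_i \otimes \mathbb{R}H$,

$$g_i \otimes z^{(i)}
= \sum_{j=1}^{k} \sum_{g \,:\, g g_j H = g_i H} x(g)\, g\, (g_j \otimes y^{(j)})
= \sum_{j=1}^{k} \sum_{g \in \Gamma_{ij}} x(g)\, (g_i \otimes h_{ij}(g)\, y^{(j)})
= g_i \otimes \left( \sum_{j=1}^{k} \sum_{g \in \Gamma_{ij}} x(g)\, h_{ij}(g)\, y^{(j)} \right),$$

where $\Gamma_{ij} = \{g \in G \mid g g_j H = g_i H\}$ and
$h_{ij}(g) = g_i^{-1} g g_j$. Thus,

$$z^{(i)} = \sum_{j=1}^{k} \omega_{ij} D_{ij}\, y^{(j)}, \qquad (10)$$

where

$$D_{ij} = \sum_{g \in \Gamma_{ij}} \frac{x(g)}{x(\Gamma_{ij})}\, h_{ij}(g) \qquad (11)$$

and where $\omega_{ij} = x(\Gamma_{ij})$. Notice that the sum in (11) is a convex
sum of permutations in $H$, hence each $D_{ij}$ takes the form of a doubly
stochastic matrix under a suitable ordering of $H$ (the basis vectors of
$\mathbb{R}H$). Furthermore, it can easily be shown that the values of
$\omega_{ij}$ are the elements of a doubly stochastic matrix [17]. Also note that
$\Gamma_{ii} = g_i H g_i^{-1} \le G$ for each $i$. The true core of Lemma 2 is the
following proposition.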

Proposition 3. For a permutation group $G \le \mathfrak{S}_{\mathcal{M}}$, let $X$
and $Y$ be independent $G$-ciphers such that $\mathrm{supp}(X) = G$. For any
$\ell$-message $p$ such that $Y$ is non-uniform on at least one left coset of
$\mathrm{Stab}_G(p)$, we have $W(XY|C^{\ell}, p) > W(Y|C^{\ell}, p)$.



Proof. Let $p$ satisfy the assumption of the proposition, and let $z$ represent the
distribution of the product $Z = XY$ as above. Let $y^{(j)}$ be non-uniform and
consider $D_{jj}\, y^{(j)}$. That is to say, let us focus on this one submatrix
block on the diagonal of the larger doubly stochastic matrix representing the
convolution $z = x * y$.

Since $\Gamma_{jj} = g_j H g_j^{-1}$ we may rewrite $D_{jj}$ as

$$D_{jj} = \sum_{h \in H} \frac{x(g_j h g_j^{-1})}{x(g_j H g_j^{-1})}\, h.$$

Thus we see that by scaling appropriately, $D_{jj}\, y^{(j)}$ has the form of a
product of two independent $H$-ciphers $\tilde{X}\tilde{Y}$, with
$P[\tilde{X} = h] = P[X = g_j h g_j^{-1}] / x(g_j H g_j^{-1})$, and
$P[\tilde{Y} = h] = P[Y = g_j h] / y(g_j H)$. But since $\mathrm{supp}(X) = G$ and
conjugation by $g_j$ yields an isomorphism of $H \leftrightarrow \Gamma_{jj}$,
$\mathrm{supp}(\tilde{X})$ is not confined to any proper coset of $H$, and we may
invoke Lemma 1 to obtain $W(\tilde{X}\tilde{Y}) > W(\tilde{Y})$ or, more
importantly for our purposes, $W(D_{jj}\, y^{(j)}) > W(y^{(j)})$.

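Note that we may bound any $W(z^{(i)})$ by the inequality of (7) as

$$W(z^{(i)}) = W\left( \sum_{m=1}^{k} \omega_{im} D_{im}\, y^{(m)} \right)
\ge \sum_{m=1}^{k} \omega_{im} W(y^{(m)}),$$

but by using $W(D_{jj}\, y^{(j)}) > W(y^{(j)})$ and (7) again, we may strictly bound
$W(z^{(j)})$ as follows

$$W(z^{(j)}) = W\left( \sum_{m=1}^{k} \omega_{jm} D_{jm}\, y^{(m)} \right)
\ge \sum_{m=1}^{k} \omega_{jm} W(D_{jm}\, y^{(m)})
> \sum_{m=1}^{k} \omega_{jm} W(y^{(m)}).$$

(Note that in both cases we have used, for doubly stochastic $D$, the inequality
$W(Dv) \ge W(v)$ which follows from simple majorization arguments [17]).
Combining these bounds on $W(z^{(i)})$ we obtain a strict bound on
$W(Z|C^{\ell}, p)$ as follows

$$W(Z|C^{\ell}, p) = \sum_{i=1}^{k} W(z^{(i)})
> \sum_{i=1}^{k} \sum_{m=1}^{k} \omega_{im} W(y^{(m)})
= \sum_{m=1}^{k} \left( \sum_{i=1}^{k} \omega_{im} \right) W(y^{(m)})
= \sum_{m=1}^{k} W(y^{(m)}) = W(Y|C^{\ell}, p),$$

which was to be proved. □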

Remark 3. Evidently, in the previous proposition, we could weaken the condition
$\mathrm{supp}(X) = G$ to: For every $\ell$-message $p$,
$\mathrm{supp}(X) \cap \mathrm{Stab}_G(p)$ is not confined to a proper coset of
$\mathrm{Stab}_G(p)$. However, for our purposes in this paper, it was not
necessary to use the weaker condition. □

The next proposition provides an important interpretation of the situation
when a cipher is uniform on every coset of an $\ell$-message stabilizer.

Proposition 4. Let $Y$ be a $G$-cipher, for a permutation group
$G \le \mathfrak{S}_{\mathcal{M}}$. For any $\ell$-message $p$ write
$H = \mathrm{Stab}_G(p)$ and we have

$$W(Y|C^{\ell}, p) \le \frac{1 + |H|}{2},$$

with equality holding iff $Y$ is uniform on each coset of $H$.

Proof. For $c$ with $\omega(c|p) \ne 0$,

$$1 \le W(Y|c,p) \le \frac{1 + |H|}{2},$$

because $W(Y|c,p)$ is the guesswork on a coset of size $|H|$. Furthermore, equality
in the upper bound is achieved iff $Y$ has constant probability on that particular
coset [17]. Now since $\sum_c \omega(c|p) = 1$, the sum from (3)

$$W(Y|C^{\ell}, p) = \sum_{c} W(Y|c,p)\, \omega(c|p)$$

is convex and therefore achieves its maximum of $\tfrac{1}{2}(1 + |H|)$ iff $Y$ is
constant on each coset of $H$ ($Y$ will of course have the constant probability 0
on those cosets corresponding to $\omega(c|p) = 0$). □
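The upper bound in Proposition 4 is easy to check numerically. The sketch below assumes the usual definition of guesswork, $W(v) = \sum_i i\, v_{[i]}$ with the probabilities $v_{[1]} \ge v_{[2]} \ge \cdots$ sorted in decreasing order; that definition is given earlier in the paper and is restated here as an assumption, and the helper name `guesswork` is illustrative.

```python
from fractions import Fraction


def guesswork(probs):
    """Expected number of guesses when candidates are tried in order of
    decreasing probability (assumed definition of W)."""
    ordered = sorted(probs, reverse=True)
    return sum(i * p for i, p in enumerate(ordered, start=1))


size = 8                                         # plays the role of |H|
uniform = [Fraction(1, size)] * size
skewed = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 16),
          Fraction(1, 32), Fraction(1, 64), Fraction(1, 128), Fraction(1, 128)]

w_uniform = guesswork(uniform)                   # equals (1 + |H|) / 2
w_skewed = guesswork(skewed)                     # strictly smaller
```

With exact rational arithmetic the uniform distribution attains $(1 + |H|)/2$ exactly, while any non-uniform distribution on the same coset falls strictly below it.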
By tying together the previous two propositions, we may finally prove
Lemma 2.



Proof (of Lemma 2). Again, let us write Z = XY. Suppose there is a p G 
such that 6^{Z) = W{Z\C^,p) and Y is non-uniform on at least one coset of 
Stabg(p). Then we may invoke Prop. 3 to obtain 

9i{Z) = W{Z\C^,p) > W{Y\C^,p) > 9,{Y). 

On the other hand, suppose that for every p € satisfying 9i(fZ') = 

W(ZjC^,p), Y is uniform on each coset of StabG(p). Let H = Stabcfy) for any 
such p. By (10), Z is uniform on each coset of H as well, and by Prop. 4, 

9i{Z) = W{Z\C^,p) = 

Now choose any p with |StabG(p)| = Mc{i) and hence 



9e{Z) 



l±M<w(z\c\p)< 



1 + Mcji) 

2 



forcing |iL| = Mc{£), and thus 9g{Z) = 9i{Ug)- Then, either 9i(Y) fy 9i{Ug), in 
which case 9i{Z) > 9i(Y), or 9i{Y) = 9i{Ug)- 

To summarize what we have proved thus far, 9^{Z) > 9i(Y) and if equal- 
ity holds then 9i(Y) = 9(^{Ug)- However conversely, if 9g{Y) = 9i{Ug)^ then 
9i{Ug) > 9i{Z) > 9i(Y) = 9i{Ug), forcing equality 9i{Z) = 9i(Y), which com- 
pletes the proof. □ 



5 Conclusion 

The issue of security amplification by product composition remains a complex 
one. In this paper, we have added to the number of situations where a definite 
answer can be given. Specifically, Theorem 1 asserts that there exists an efficient
cipher X such that the security of XY is strictly greater than that of Y unless Y is
perfect. There is room for further improvement in this result. For example, a
more efficient cipher might be constructed which makes use of a weakened form 
of Lemma 2 as discussed in Remark 3. Additionally, our implementation might 
be optimized for bulk encryption. 

The cipher we construct to prove Theorem 1 is costly in some ways but has 
other desirable properties. Unlike a one-time pad, if the key were replaced by 
a pseudo-random source, a known plaintext-ciphertext block would not trivially 
betray the key used for that block. This property could be useful in constructing 
provably secure practical encryption systems. Also observe that our construction 
is not an iterated cryptosystem but rather a product of independent rounds 
with a carefully chosen order. The techniques employed here might be a useful 
new paradigm for practical cryptosystems with key schedules instead of a truly 
random source of key material. 
Finally we note that due to Lemma 2 and the nature of our existence question
we have been content to focus on strict inequalities and strict amplification of 
support alone. While an infinitesimally small increase from Oi(Y) to 9i{XY) 
is possible, techniques beyond the scope of this paper have been developed to 
establish much stronger claims of amplification. Ongoing research suggests that 
the cipher X of Lemma 3 has stronger security properties than required by 
Theorem 1. 
Decorrelation over Infinite Domains:
The Encrypted CBC-MAC Case 
Abstract. Decorrelation theory has recently been proposed in order to
address the security of block ciphers and other cryptographic primitives 
over a finite domain. We show here how to extend it to infinite domains, 
which can be used in the Message Authentication Code (MAC) case. 

In 1994, Bellare, Kilian and Rogaway proved that CBC-MAC is secure
when the input length is fixed. Petrank and Rackoff extended this result
to variable-length inputs in 1997.

In this paper, we prove a result similar to that of Petrank and Rackoff by
using decorrelation theory. This leads to a slightly improved result and
a more compact proof.

This result is meant to be a general proving technique for security, which 
can be compared to the approach which was announced by Maurer at 
CRYPTO’99. 



Decorrelation theory has recently been introduced. (See references [17] to [22].) 
Its first aim was to address provable security in the area of block ciphers in 
order to prove their security against differential [7] and linear cryptanalysis [10]. 
As a matter of fact, these techniques have also been used in order to prove 
Luby-Rackoff -like pseudorandomness results [9] in a way similar to Patarin’s 
“coefficient H method” [14,15]. All previous cases however address random func- 
tions over a finite domain, which is not appropriate for MACs. 

The CBC-MAC construction is a well-known way to build Message
Authentication Codes from a block cipher in Cipher Block Chaining mode. Namely,
if $C$ is a permutation defined on a block space $\{0,1\}^m$, for a message
$x = (m_1, \ldots, m_\ell) \in (\{0,1\}^m)^\ell$ we define

$$\mathrm{MAC}(x) = C(C(\ldots C(m_1) + m_2 \ldots) + m_\ell).$$

The addition is traditionally the XOR operation but can be replaced by any
group (or even quasigroup) law. In 1994, Bellare, Kilian and Rogaway proved
that if $C$ is a uniformly distributed random permutation, then for any integer
$\ell$ and any distinguisher between MAC and a truly random function which is
limited to $d$ queries, the advantage is less than $3d^2\ell^2 2^{-m}$ [6]. This
shows that no adaptive attack can forge a new valid $(x, \mathrm{MAC}(x))$ pair
with a relevant probability unless the total number of known blocks $d\ell$ is
within the order of $2^{m/2}$. This however holds only when all messages have the
fixed length $\ell$. If the attacker is
allowed to use messages of different lengths, it is easy to notice that for any
message $x$ and any block $a$ the MAC of $x$ concatenated with
$a - \mathrm{MAC}(x)$ is

$$\mathrm{MAC}(x, a - \mathrm{MAC}(x)) = C(a),$$

which does not depend on $x$ and allows an attacker to forge a new authenticated
message by replacing $x$.
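The forgery above can be reproduced on a toy instance. The following sketch assumes 8-bit blocks, XOR as the addition (so that $a - \mathrm{MAC}(x)$ becomes $a \oplus \mathrm{MAC}(x)$), and a seeded random table standing in for the permutation $C$; `make_permutation` and `cbc_mac` are illustrative names, not part of the paper.

```python
import random


def make_permutation(seed, m=8):
    """Toy stand-in for the block cipher C: a seeded random permutation
    of {0, ..., 2^m - 1} stored as a lookup table."""
    table = list(range(1 << m))
    random.Random(seed).shuffle(table)
    return table


def cbc_mac(perm, blocks):
    """MAC(x) = C(C(... C(m_1) + m_2 ...) + m_l) with XOR as the addition."""
    state = 0
    for block in blocks:
        state = perm[state ^ block]
    return state


C = make_permutation(seed=2024)
a = 0x5A
messages = [[1, 2, 3], [200], [7, 7, 7, 7]]

# Appending the block a XOR MAC(x) always produces the tag C(a),
# whatever the original message x was.
extended_tags = [cbc_mac(C, x + [a ^ cbc_mac(C, x)]) for x in messages]
```

All entries of `extended_tags` coincide with `C[a]`, so one observed tag for an extended message authenticates the extension of any other message.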

In 1997, Petrank and Rackoff addressed the case of DMAC defined by

$$\mathrm{MAC}(x) = C_2(C_1(C_1(\ldots C_1(m_1) + m_2 \ldots) + m_\ell))$$

(see [16]). This type of construction is not original, since it is
already suggested by several standards [2,3,4]. Its security was however formally
proved in [16] for the first time.

If we replace $C_2$ by $C_2 \circ C_1^{-1}$ we can obviously remove the last $C_1$
application. We can thus consider the MAC defined by

$$\mathrm{MAC}(x) = C_2(C_1(\ldots C_1(m_1) + m_2 \ldots) + m_\ell)$$

which we call the “encrypted CBC-MAC” in the sequel. In this paper we give a
security proof which is different from [16] and with a slightly improved
reduction. Our proof also happens to be more compact (it is less than two pages
long), thanks to the use of the decorrelation theory tools. Our approach is also
more general and can be applied to other schemes. In this way it can be compared
to the information theoretic general approach which was announced by Maurer at
CRYPTO’99 [12].
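The substitution of $C_2 \circ C_1^{-1}$ for $C_2$ can be checked mechanically. The sketch below again assumes 8-bit blocks, XOR as the addition, and seeded random lookup tables as stand-ins for $C_1$ and $C_2$; the function names are illustrative.

```python
import random


def make_permutation(seed, m=8):
    """Toy stand-in for a block cipher: a seeded random permutation table."""
    table = list(range(1 << m))
    random.Random(seed).shuffle(table)
    return table


def invert(perm):
    inverse = [0] * len(perm)
    for x, y in enumerate(perm):
        inverse[y] = x
    return inverse


def dmac(c1, c2, blocks):
    """C2(C1(C1(... C1(m_1) + m_2 ...) + m_l)) with XOR as the addition."""
    state = 0
    for block in blocks:
        state = c1[state ^ block]
    return c2[state]


def encrypted_cbc_mac(c1, c2, blocks):
    """C2(C1(... C1(m_1) + m_2 ...) + m_l): the last C1 call is dropped."""
    state = 0
    for block in blocks[:-1]:
        state = c1[state ^ block]
    return c2[state ^ blocks[-1]]


C1 = make_permutation(seed=1)
C2 = make_permutation(seed=2)
C1_inv = invert(C1)
C2_after_C1_inv = [C2[C1_inv[v]] for v in range(256)]   # C2 o C1^{-1}

messages = [[9], [1, 2, 3], [255, 0, 17, 42]]
same = [dmac(C1, C2_after_C1_inv, x) == encrypted_cbc_mac(C1, C2, x)
        for x in messages]
```

DMAC keyed with $(C_1, C_2 \circ C_1^{-1})$ and the encrypted CBC-MAC keyed with $(C_1, C_2)$ agree on every message, which is why the two constructions can be analysed interchangeably when $C_2$ is independent of $C_1$.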

1 Prerequisite 

1.1 Definitions and Notations 

First of all, for any random function $F$ from a set $\mathcal{M}_1$ to a set
$\mathcal{M}_2$ and any integer $d$ we associate the “$d$-wise distribution
matrix”, which is denoted $[F]^d$, defined in the matrix set
$\mathbb{R}^{\mathcal{M}_1^d \times \mathcal{M}_2^d}$ by

$$[F]^d_{(x_1,\ldots,x_d),(y_1,\ldots,y_d)} = \Pr[F(x_1) = y_1, \ldots, F(x_d) = y_d].$$

Given a metric structure $D$ in $\mathbb{R}^{\mathcal{M}_1^d \times \mathcal{M}_2^d}$
we can define the distance between the matrices associated to two random functions
$F$ and $G$. This is the “$d$-wise decorrelation distance”. If $G$ is a random
function uniformly distributed in the set of all functions from $\mathcal{M}_1$ to
$\mathcal{M}_2$ (we let $F^*$ denote such a function), this distance is called the
“$d$-wise decorrelation bias of function $F$” and denoted $\mathrm{DecF}^d(F)$.
When $F$ is a permutation (which will usually be denoted $C$ as for “Cipher”)
and $G$ is a uniformly distributed permutation (denoted $C^*$) it is called the
“$d$-wise decorrelation bias of permutation $F$” and denoted $\mathrm{DecP}^d(F)$.
In previous results we used the metric structures defined by the norms denoted
$\|.\|_2$ (see [18]), $|||.|||_\infty$, $\|.\|_a$, $\|.\|_s$ (see [21]). These four
norms are matrix norms, which means that they are norms on the matrix sets with
the property that

$$\|A \times B\| \le \|A\| . \|B\|.$$

This property leads to non-trivial inequalities which can shorten many
treatments on the security of conventional cryptography.

Given two random functions $F$ and $G$ from $\mathcal{M}_1$ to $\mathcal{M}_2$ we
call “distinguisher between $F$ and $G$” any oracle Turing machine
$\mathcal{A}^O$ which can send $\mathcal{M}_1$-element queries to the oracle $O$
and receive $\mathcal{M}_2$-element responses, and which finally outputs 0 or 1.
In particular the Turing machine can be probabilistic. In the following, the
number of queries to the oracle will be limited to $d$. The distributions on $F$
and $G$ induce a distribution on $\mathcal{A}^F$ and $\mathcal{A}^G$, thus we can
compute the probability that these probabilistic Turing machines output 1. The
advantage for distinguishing $F$ from $G$ is

$$\mathrm{Adv}_{\mathcal{A}}(F, G) = \Pr[\mathcal{A}^F = 1] - \Pr[\mathcal{A}^G = 1].$$

For any class of distinguishers $\mathrm{Cl}$ we will denote

$$\mathrm{Adv}_{\mathrm{Cl}}(F, G) = \max_{\mathcal{A} \in \mathrm{Cl}} \mathrm{Adv}_{\mathcal{A}}(F, G).$$

We notice that if $\mathcal{A}$ is a distinguisher, we can always define a
complementary distinguisher $\bar{\mathcal{A}} = 1 - \mathcal{A}$ which gives the
opposite output. There is no need for investigating the minimum advantage when the
class is closed under the complement (which is the case of the above class) since

$$\mathrm{Adv}_{\bar{\mathcal{A}}}(F, G) = -\mathrm{Adv}_{\mathcal{A}}(F, G).$$

We consider the class $\mathrm{Cl}_a^d$ of all (adaptive) distinguishers limited to
$d$ queries.

1.2 Properties 

The $d$-wise distribution matrices have the property that if $F$ and $G$ are
independent random functions, $F$ from $\mathcal{M}_2$ to $\mathcal{M}_3$ and $G$
from $\mathcal{M}_1$ to $\mathcal{M}_2$, then

$$[F \circ G]^d = [G]^d \times [F]^d.$$

Thus, if we are using a matrix norm $\|.\|$, we obtain

$$\mathrm{DecF}^d_{\|.\|}(F \circ G) \le \mathrm{DecF}^d_{\|.\|}(F) . \mathrm{DecF}^d_{\|.\|}(G),$$

and the same for permutations.
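The composition rule for distribution matrices can be verified exactly on tiny sets. In the sketch below a random function is represented as a dictionary from function tables (tuples of images) to probabilities; the sets, the two distributions, and all helper names are arbitrary illustrative choices.

```python
from fractions import Fraction
from itertools import product


def dist_matrix(rf, domain, codomain, d):
    """[F]^d as a dict keyed by (x-tuple, y-tuple); rf maps a function table
    (tuple of images, indexed by domain element) to its probability."""
    mat = {}
    for xs in product(domain, repeat=d):
        for ys in product(codomain, repeat=d):
            mat[xs, ys] = sum(p for fn, p in rf.items()
                              if all(fn[x] == y for x, y in zip(xs, ys)))
    return mat


def mat_mul(a, b, rows, mids, cols):
    return {(r, c): sum(a[r, m] * b[m, c] for m in mids)
            for r in rows for c in cols}


def compose(rf_f, rf_g):
    """Distribution of F o G for independent F and G."""
    out = {}
    for g, pg in rf_g.items():
        for f, pf in rf_f.items():
            h = tuple(f[g[x]] for x in range(len(g)))
            out[h] = out.get(h, 0) + pf * pg
    return out


M1, M2, M3, d = range(2), range(3), range(2), 2
G = {(0, 1): Fraction(1, 2), (2, 2): Fraction(1, 4), (1, 0): Fraction(1, 4)}
F = {(0, 1, 1): Fraction(1, 3), (1, 0, 1): Fraction(2, 3)}

rows = list(product(M1, repeat=d))
mids = list(product(M2, repeat=d))
cols = list(product(M3, repeat=d))

lhs = dist_matrix(compose(F, G), M1, M3, d)
rhs = mat_mul(dist_matrix(G, M1, M2, d), dist_matrix(F, M2, M3, d),
              rows, mids, cols)
```

The dictionaries `lhs` and `rhs` are equal entry by entry, which is the identity $[F \circ G]^d = [G]^d \times [F]^d$ for this instance; the independence of $F$ and $G$ is what lets the joint probability split into the matrix product.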

The $\|.\|_a$ norm defined in [21] has the quite interesting property that it
characterizes the best advantage of a distinguisher in $\mathrm{Cl}_a^d$.

Lemma 1 ([21]). For any random functions $F$ and $G$ we have

$$\|[F]^d - [G]^d\|_a = 2 . \mathrm{Adv}_{\mathrm{Cl}_a^d}(F, G).$$
In this paper, we will use the $\|.\|_a$ norm only and omit it in the notations.
Finally we recall the following lemma.

Lemma 2 ([21]). Let $d$ be an integer, $F_1, \ldots, F_r$ be $r$ random function
oracles, and $C_1, \ldots, C_s$ be $s$ random permutation oracles. We let $\Omega$
be a deterministic oracle Turing machine which can access the previous oracles
and an input tape $x$. It defines a random function
$G(x) = \Omega(F_1, \ldots, F_r, C_1, \ldots, C_s)(x)$. We assume that $\Omega$ is
such that the number of queries to $F_i$ is limited to some integer $a_i$, and the
number of queries to $C_j$ is limited to $b_j$ in total for any $i = 1, \ldots, r$
and any $j = 1, \ldots, s$. We let the $F_i^*$ (resp. $C_j^*$) be independent
uniformly distributed random functions (resp. permutations) on the same range as
$F_i$ (resp. $C_j$) and we let
$G^* = \Omega(F_1^*, \ldots, F_r^*, C_1^*, \ldots, C_s^*)$. We have

$$\mathrm{DecF}^d(G) \le \sum_{i=1}^{r} \mathrm{DecF}^{a_i d}(F_i)
+ \sum_{j=1}^{s} \mathrm{DecP}^{b_j d}(C_j) + \mathrm{DecF}^d(G^*).$$

This lemma actually separates the problem of studying the decorrelation bias
of a construction scheme into the problem of studying the decorrelation biases
of its internal functions $F_i$ and $C_j$ and studying the decorrelation bias of an
idealized version $G^*$.

1.3 The Coefficient H Method 

Patarin introduced the “coefficient H method”, which makes pseudorandomness
proofs more systematic. In the decorrelation theory setting, this
method can be formalized by the following lemma.

Lemma 3 ([22]). Let $d$ be an integer. Let $F$ be a random function from a set
$\mathcal{M}_1$ to a set $\mathcal{M}_2$. We let $\mathcal{X}$ be the subset of
$\mathcal{M}_1^d$ of all $(x_1, \ldots, x_d)$ with pairwise different entries. We
let $F^*$ be a uniformly distributed random function from $\mathcal{M}_1$ to
$\mathcal{M}_2$. We assume there exist a subset
$\mathcal{Y} \subseteq \mathcal{M}_2^d$ and two positive numbers $\epsilon_1$ and
$\epsilon_2$ such that

- $|\mathcal{Y}| (\#\mathcal{M}_2)^{-d} \ge 1 - \epsilon_1$

- $\forall x \in \mathcal{X}\ \forall y \in \mathcal{Y}\quad [F]^d_{x,y} \ge (1 - \epsilon_2)(\#\mathcal{M}_2)^{-d}$.

Then we have $\mathrm{DecF}^d(F) \le 2\epsilon_1 + 2\epsilon_2$.

This lemma intuitively means that if $[F]^d_{x,y}$ is close to $[F^*]^d_{x,y}$ for
all $x$ and almost all $y$, then the decorrelation bias of $F$ is small. It is quite
straightforward with techniques inspired by Patarin [14,15] and Maurer [11].

As an illustration, Lemma 3 can be used in order to prove the famous
Luby-Rackoff Theorem easily, as shown in the Appendix.

Theorem 4 (Luby-Rackoff 1986 [9]). Let $F_1^*, F_2^*, F_3^*$ be three independent
random functions on $\{0,1\}^{m/2}$ with uniform distribution. We have

$$\mathrm{DecF}^d(\Psi(F_1^*, F_2^*, F_3^*)) \le 2d^2 . 2^{-m/2}$$
$$\mathrm{DecP}^d(\Psi(F_1^*, F_2^*, F_3^*)) \le 2d^2 . 2^{-m/2}.$$

The results hold for Feistel schemes defined from any (quasi)group operation^1.

2 Decorrelation Biases of Functions 
over an Infinite Domain 

In order to define decorrelation biases of MACs, we need to address the problem
of having infinite sets. Let for instance $F$ be a random function defined from
$\mathcal{M}_1^*$ to $\mathcal{M}_2$ ($\mathcal{M}_1^*$ is the set of all finite
sequences with entries in $\mathcal{M}_1$). We define the $[F]^{q_1,\ldots,q_d}$
matrix with rows defined on $\mathcal{M}_1^{q_1} \times \ldots \times \mathcal{M}_1^{q_d}$
and columns defined on $\mathcal{M}_2^d$. Next we define
$\mathrm{DecF}^{q_1,\ldots,q_d}(F)$ as the distance between $[F]^{q_1,\ldots,q_d}$ and
$[F^*]^{q_1,\ldots,q_d}$, where $F^*$ has a uniform distribution. Additionally, we
can define

$$\mathrm{DecF}^{d,q}(F) = \max_{q_1 + \ldots + q_d = q} \mathrm{DecF}^{q_1,\ldots,q_d}(F).$$

We can easily check that all previous results remain valid for these definitions,
namely:

- The best advantage of a distinguisher limited to $d$ (adaptively) chosen queries
with a total length of $q$ blocks between $F$ and $F^*$ is
$\tfrac{1}{2}\mathrm{DecF}^{d,q}(F)$.

- As in Lemma 2, if $G = \Omega(F_1, \ldots, F_r, F'_1, \ldots, F'_s)$ uses functions
$F_i$ and $F'_j$ on fixed input length, but with occurrence numbers of $a_i \ell$
and $b_j \ell$ respectively where $\ell$ is the length of the input of $G$, we have

$$\mathrm{DecF}^{d,q}(G) \le \sum_{i=1}^{r} \mathrm{DecF}^{a_i q}(F_i)
+ \sum_{j=1}^{s} \mathrm{DecF}^{b_j q}(F'_j) + \mathrm{DecF}^{d,q}(G^*).$$

We can use permutations $C_i$ and $C'_j$ as well and have DecP instead of DecF,
or even mixtures of functions and permutations.

- Lemma 3 still holds with $\mathrm{DecF}^{d,q}$ instead of $\mathrm{DecF}^d$ and
$\mathcal{X}$ equal to the set of $(x_1, \ldots, x_d)$ with total length $q$.


3 Security of MAC 

Message Authentication Codes (MAC) are functions which map any binary
string onto a fixed length value^2 with a secret key. In this paper, we consider
functions defined on the set $(\{0,1\}^m)^*$ of finite sequences of $m$-bit integers^3. For

^1 Here $\Psi(F_1^*, F_2^*, F_3^*)$ is the standard notation for a Feistel cipher
with three rounds and round functions $F_1^*, F_2^*, F_3^*$.

^2 More precisely, the MAC is the output of the function, but we will improperly
call the function a MAC.

^3 Note that arbitrary bit strings do not always have an integral number of blocks.
For this we must use a padding scheme like the Merkle-Damgård [8,13] one in order
to transform an arbitrary string into a string with an integral number of blocks.
In this paper we prove the security for padded messages, which induces the security
for the whole scheme with the padding scheme.
instance, given a block cipher $\mathrm{Enc}_K$ which is a permutation on
$\{0,1\}^m$ defined from a secret key $K$, we consider the CBC-MAC construction
defined by

$$\mathrm{MAC}_K(m_1, \ldots, m_\ell) = \mathrm{Enc}_K(\mathrm{Enc}_K(\ldots \mathrm{Enc}_K(m_1) + m_2 \ldots) + m_\ell).$$

Since the secret key $K$ is unknown by the opponent and chosen at random
by the legitimate user, we can consider equivalently $C = \mathrm{Enc}_K$ as a
random permutation with a given publicly known distribution, and the MAC itself as
a random function.

The purpose of MACs is to authenticate messages. Namely, the legitimate
authenticator provides $\mathrm{MAC}(x)$ in order to authenticate a message $x$.
Saying that a MAC is $(d, q, p)$-secure means that for any opponent who can use the
legitimate authenticator as an oracle for at most $d - 1$ chosen messages
$x_1, \ldots, x_{d-1}$ and issue an $(x_d, c)$ pair such that $x_d \ne x_i$ for any
$i$ and that the total length of $x_1, \ldots, x_d$ is of $q$ $m$-bit blocks, the
probability that $c = \mathrm{MAC}(x_d)$ is less than $p$. This is the security
against adaptive existential forgery attacks.

We notice that if MAC is such that $\mathrm{DecF}^{d,q}(\mathrm{MAC}) = \epsilon$,
then it is a $(d, q, 2^{-m} + \frac{\epsilon}{2})$-secure MAC. Namely, for any
opponent we can make a distinguisher which just queries the forged $x_d$ and checks
whether the output is $c$ or not. Since the advantage must be less than
$\frac{\epsilon}{2}$, the probability of success of the opponent must be less than
$\frac{\epsilon}{2}$ plus the probability of success against a truly random
function, which is $2^{-m}$. Hence we use $\mathrm{DecF}^{d,q}(\mathrm{MAC})$ upper
bounds as security evidence.

For instance, we can consider the Bellare-Kilian-Rogaway result which works
with a fixed input length $\ell$.

Theorem 5 (Bellare-Kilian-Rogaway 1994 [6]). For any fixed integer $\ell$, we
consider the function MAC defined on $\ell$ $m$-bit blocks from a uniformly
distributed random function $F^*$ as follows.

$$\mathrm{MAC}(m_1, \ldots, m_\ell) = F^*(F^*(\ldots F^*(m_1) + m_2 \ldots) + m_\ell).$$

For any $d$ we have $\mathrm{DecF}^d(\mathrm{MAC}) \le 6d^2\ell^2 2^{-m}$. This holds
for any (quasi)group addition.

Here is another result which is quite similar to the An-Bellare result [5].

Theorem 6 ([22]). Let $F_1$ and $F_2$ be two independent random functions from
$\{0,1\}^{b+m}$ to $\{0,1\}^b$. For any $\ell$ and any
$(m_1, \ldots, m_\ell) \in (\{0,1\}^m)^\ell$ we define

$$\mathrm{MAC}(m_1, \ldots, m_\ell) = F_2(F_1(\ldots F_1(F_1(0, m_1), m_2) \ldots, m_\ell), \ell)$$

where 0 means a $b$-bit zero string, and $\ell$ means an $m$-bit string which
represents the $\ell$ value. Considering distinguishers limited to $d$ queries and
a total length of $qm$ bits we have

$$\mathrm{DecF}^{d,q}(\mathrm{MAC}) \le \mathrm{DecF}^{q}(F_1) + \mathrm{DecF}^{d}(F_2) + q(q-1)2^{-m}.$$

Finally, here is the Petrank-Rackoff [16] result.

Theorem 7 (Petrank-Rackoff [16]). Let $C_1$ and $C_2$ be two independent random
permutations on $\{0,1\}^m$ with the same distribution $C$. For any $\ell$ and any
$(m_1, \ldots, m_\ell) \in (\{0,1\}^m)^\ell$ we define

$$\mathrm{MAC}(m_1, \ldots, m_\ell) = C_2(C_1(C_1(\ldots C_1(C_1(m_1) + m_2) \ldots + m_{\ell-1}) + m_\ell)).$$

Considering adaptive distinguishers limited to d queries and a total length of qm 
bits we have 

DecF‘'’«(MAC) < 2DecP'?(C') + V2"™. 

The result holds for any (quasi)group addition. 

4 Encrypted CBC-MAC 

Here is our main result. 

Theorem 8. Let $C_1$ and $C_2$ be two independent random permutations over
$\{0,1\}^m$. For any $\ell$ and any $(m_1, \ldots, m_\ell) \in (\{0,1\}^m)^\ell$ we define

$$\mathrm{MAC}(m_1, \ldots, m_\ell) = C_2(C_1(\ldots C_1(C_1(m_1) + m_2) \ldots + m_{\ell-1}) + m_\ell).$$

Considering adaptive distinguishers limited to $d$ queries and a total length of $qm$
bits we have

$$\mathrm{DecF}^{d,q}(\mathrm{MAC}) \le \mathrm{DecP}^{q-d}(C_1) + \mathrm{DecP}^{d}(C_2) + d(d-1)2^{-m} + q(q+1)(1 + q2^{-m})2^{-m}.$$

The result holds for any (quasi)group addition. 

This result is slightly better than Theorem 7. 

Proof. Lemma 2 reduces to the case where $C_1$ and $C_2$ are independent uniformly
distributed random permutations.

Using Lemma 3, let $\mathcal{Y}$ be the set of all $y = (y_1, \ldots, y_d)$ with pairwise different
$y_i$'s. We thus have

$$|\mathcal{Y}|\, 2^{-md} = \frac{2^m (2^m - 1) \cdots (2^m - d + 1)}{2^{md}} \ge 1 - \frac{d(d-1)}{2}\, 2^{-m}.$$

Now for any collection of $x_i = (m_{i,1}, \ldots, m_{i,q_i})$ we let

$$U_{i,j} = C_1(\ldots C_1(C_1(m_{i,1}) + m_{i,2}) \ldots + m_{i,j-1}) + m_{i,j}.$$

We consider the event $E$ that all $U_{i,q_i}$ are pairwise different. We have

$$\begin{aligned}
[\mathrm{MAC}]^d_{x,y} &\ge \Pr[\mathrm{MAC}(x_i) = y_i;\ i = 1, \ldots, d \text{ and } E] \\
&= \Pr[\mathrm{MAC}(x_i) = y_i;\ i = 1, \ldots, d \mid E]\, \Pr[E] \\
&= \frac{1}{2^m (2^m - 1) \cdots (2^m - d + 1)}\, (1 - \Pr[\overline{E}]) \\
&\ge 2^{-md} (1 - \Pr[\overline{E}])
\end{aligned}$$

therefore we can take $\varepsilon_2 = \Pr[\overline{E}] = \Pr[\exists i < r;\ U_{i,q_i} = U_{r,q_r}]$.

The remaining part of the proof consists of upper bounding $\varepsilon_2$ by
$\frac{q(q+1)}{2}(1 + q2^{-m})2^{-m}$ and applying Lemma 3.

We call a collision an event $U_{i,j} = U_{r,s}$. This collision is trivial if we have
$(m_{i,1}, \ldots, m_{i,j}) = (m_{r,1}, \ldots, m_{r,s})$ and non-trivial otherwise. Let Inv be the
event that $C_1(U_{i,j}) = 0$ for some $i, j$, and let Coll be the event that we have a
non-trivial collision. We can easily show that the $\overline{E}$ event is included in $\mathrm{Inv} \cup \mathrm{Coll}$:
if $U_{i,q_i} = U_{r,q_r}$ then either $m_{i,q_i} \neq m_{r,q_r}$ and it is a non-trivial collision, or it
reduces to $U_{i,q_i-1} = U_{r,q_r-1}$ and we can iterate. Thus $\varepsilon_2 \le \Pr[\mathrm{Inv}] + \Pr[\mathrm{Coll}]$.

The probability that any adaptive attack against C\ finds a preimage of 0 
after q — d queries is obviously less than 2 ^-q ■ Thus Pr[Inv] < ^ . 

We let $\mathcal{U}$ be the set of all $U_{i,j}$-indices, which means the set of all $(i,j)$ such
that $1 \le i \le d$ and $1 \le j \le q_i$. For $A \subseteq \mathcal{U}$ we let $c(A)$ be

$$c(A) = \{(i,j);\ \exists (r,s) \in A,\ i = r \text{ and } j \le s\}.$$

Thus $c(A)$ is the set of the indices of all $U_{i,j}$ which are required in order to compute
all $U_{r,s}$ values for $(r,s) \in A$. We define an ordering on $2^{\mathcal{U}}$ by

$$A \le B \iff c(A) \subseteq c(B).$$

We let $\mathcal{I}$ be the set of all index pairs of potential non-trivial collisions
$U_{i,j} = U_{r,s}$, namely the set of all pairs $\{(i,j),(r,s)\}$ of $\mathcal{U}$-elements such that
$(m_{i,1}, \ldots, m_{i,j}) \neq (m_{r,1}, \ldots, m_{r,s})$. For any $i,j,r,s$ such that $\{(i,j),(r,s)\} \in \mathcal{I}$
we let $\mathrm{Coll}_{i,j,r,s}$ be the event of the collision $U_{i,j} = U_{r,s}$ (which is necessarily
non-trivial since $\{(i,j),(r,s)\} \in \mathcal{I}$), and we let $\mathrm{MinColl}_{i,j,r,s}$ be the complement
in $\mathrm{Coll}_{i,j,r,s}$ of the union of all $\mathrm{Coll}_{i',j',r',s'}$ for $\{(i',j'),(r',s')\} \in \mathcal{I}$
and $\{(i',j'),(r',s')\} < \{(i,j),(r,s)\}$, i.e. the event $U_{i,j} = U_{r,s}$ with no prior
non-trivial collision. We easily notice that

$$\mathrm{Coll} = \bigcup_{\{(i,j),(r,s)\} \in \mathcal{I}} \mathrm{MinColl}_{i,j,r,s}.$$

We have at most $\frac{q(q-1)}{2}$ terms in $\mathcal{I}$. Hence

$$\Pr[\mathrm{Coll}] \le \frac{q(q-1)}{2} \max \Pr[\mathrm{MinColl}_{i,j,r,s}].$$

For $\{(i,j),(r,s)\} \in \mathcal{I}$, let us consider the $\mathrm{MinColl}_{i,j,r,s}$ event. We assume
without loss of generality that $s \le j$. Since we have no prior collision we must
have $m_{i,j} \neq m_{r,s}$. Furthermore we must have $U_{i,j-1} \neq U_{r,s-1}$ because $C_1$ is a
permutation (otherwise $C_1(U_{i,j-1}) + m_{i,j}$ cannot be equal to $C_1(U_{r,s-1}) + m_{r,s}$)
and $j > 1$, and we need to consider the event

$$C_1(U_{i,j-1}) + m_{i,j} = U_{r,s}.$$

If we have a collision $U_{i,j-1} = U_{i',j'}$ with $(i,j-1) \neq (i',j')$ and $(i',j') \in
c(i,j,r,s)$, it must be trivial (otherwise the initial collision is not minimal) which
means $j' = j - 1$ and $i' = r \neq i$ and $(m_{i,1}, \ldots, m_{i,j-1}) = (m_{r,1}, \ldots, m_{r,j-1})$. If
$s < j$ we have $U_{i,s} = U_{r,s}$ and $U_{r,s} = U_{i,j}$, thus $U_{i,j} = U_{i,s}$ which is non-trivial,
which contradicts the minimality of the initial collision. Thus we must have
$s = j$, but the trivial collision $U_{i,j-1} = U_{r,j-1}$ then contradicts $U_{i,j-1} \neq U_{r,s-1}$.
Therefore $U_{i,j-1}$ is equal to no $U_{i',j'}$ for $(i',j') \in c(i,j,r,s) \setminus \{(i,j-1)\}$. This
implies that the marginal distribution of $C_1(U_{i,j-1})$ with the knowledge of all
previous $U_{i',j'}$ is uniform among a set of at least $2^m - q + 1$ elements. Hence
$\Pr[\mathrm{MinColl}_{i,j,r,s}] \le \frac{1}{2^m - q}$.
Finally we obtain

$$\varepsilon_2 \le \Pr[\mathrm{Inv}] + \frac{q(q-1)}{2} \cdot \frac{1}{2^m - q} \le \frac{q(q+1)}{2}\,(1 + q2^{-m})\,2^{-m}.$$

Applying Lemma 3 now completes the proof. □



5 Extensions 

In our result we notice that since $d \le q$, the bound is small until $q$ reaches
the order of $2^{m/2}$. This result is tight since usual collision attacks can break
our construction within this complexity. Actually, we can query $2^{m/2}$ two-block
messages until we get a collision $\mathrm{MAC}(m_1, m_2) = \mathrm{MAC}(m'_1, m'_2)$, then query $c =
\mathrm{MAC}(m_1, m_2, m_3)$ and output a forged authenticated message $((m'_1, m'_2, m_3), c)$.
We have $d = 2^{m/2} + 2$ and $q = 2 \cdot 2^{m/2} + 6$ and $p \approx 1 - e^{-1/2}$.
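The collision attack above can be tried out on a toy instance. The following is a minimal sketch, not the author's code: it assumes a 12-bit block size, XOR as the group addition, and permutations drawn as shuffled lookup tables; the names `mac`, `random_permutation`, and `collision_forgery` are illustrative only.

```python
import random

M_BITS = 12                      # toy block size (assumption); real ciphers use m = 64 or 128
SIZE = 1 << M_BITS

def random_permutation(rng):
    """Return a uniformly random permutation of {0, ..., 2^m - 1} as a lookup table."""
    table = list(range(SIZE))
    rng.shuffle(table)
    return table

def mac(c1, c2, blocks):
    """Encrypted CBC-MAC of Theorem 8, with XOR as the group addition."""
    u = blocks[0]
    for block in blocks[1:]:
        u = c1[u] ^ block        # U_{i,j} = C1(U_{i,j-1}) + m_{i,j}
    return c2[u]                 # the last chaining value goes through C2 only

def collision_forgery(oracle, rng):
    """Query two-block messages until two tags collide, then forge a three-block tag."""
    seen = {}
    queries = 0
    while True:
        msg = (rng.randrange(SIZE), rng.randrange(SIZE))
        tag = oracle(msg)
        queries += 1
        if tag in seen and seen[tag] != msg:
            first, second = seen[tag], msg
            break
        seen[tag] = msg
    m3 = rng.randrange(SIZE)
    c = oracle(first + (m3,))    # one more legitimate query
    queries += 1
    return second + (m3,), c, queries

rng = random.Random(2000)
c1, c2 = random_permutation(rng), random_permutation(rng)
forged_msg, forged_tag, queries = collision_forgery(lambda x: mac(c1, c2, x), rng)
print(queries, mac(c1, c2, forged_msg) == forged_tag)
```

The forgery always verifies once a collision is found: since $C_2$ is a permutation, equal tags on two-block messages imply equal chaining values, and appending the same third block keeps them equal. The number of queries needed is of the order of $2^{m/2}$ on average.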

We may think that since we have an $m$-bit MAC and a security of $2^{m/2}$ uses we
have an efficiency loss in terms of storage. We can improve this construction by
shrinking the MAC to $\frac{m}{2}$ bits as suggested in most standards. More precisely,
let $F$ be a random function from $\{0,1\}^m$ to $\{0,1\}^b$. We can define

$$\mathrm{MAC}(m_1, \ldots, m_\ell) = F(C(\ldots C(C(m_1) + m_2) \ldots + m_{\ell-1}) + m_\ell)$$

and we have

$$\mathrm{DecF}^{d,q}(\mathrm{MAC}) \le \mathrm{DecP}^{q}(C) + \mathrm{DecF}^{d}(F) + q(q+1)(1 + q2^{-m})2^{-m}.$$

(In the proof of Theorem 8, we take $\mathcal{Y}$ equal to the full set so that $\varepsilon_1 = 0$.)

If we now want to shorten the two keys, we can replace the independent $C$
and $F$ random functions by dependent ones. Let $\|[C,F]^q - [C_0,F_0]^q\|_a$ denote the
decorrelation distance between the $(C, F)$ pair and a pair $(C_0, F_0)$ of independent
random functions such that $C_0$ (resp. $F_0$) has the same distribution as $C$ (resp.
$F$). This is half of the best advantage for distinguishing them from $q$ queries.
We should still consider $\mathrm{DecP}^{q-d}(C)$ and $\mathrm{DecF}^{d}(F)$. So, even if $C$ and $F$ are
dependent, we still have the following result.

Theorem 9. Let $C$ and $C_0$ be two identically distributed random permutations
on $\{0,1\}^m$ and let $F$ and $F_0$ be two identically distributed random functions from
$\{0,1\}^m$ to $\{0,1\}^b$. We assume that $C_0$ and $F_0$ are independent. For any $\ell$ and
any $(m_1, \ldots, m_\ell) \in (\{0,1\}^m)^\ell$ we define

$$\mathrm{MAC}(m_1, \ldots, m_\ell) = F(C(\ldots C(C(m_1) + m_2) \ldots + m_{\ell-1}) + m_\ell).$$

Considering adaptive distinguishers limited to $d$ queries and a total length of $qm$
bits we have

$$\mathrm{DecF}^{d,q}(\mathrm{MAC}) \le \|[C,F]^q - [C_0,F_0]^q\|_a + \mathrm{DecP}^{q-d}(C) + \mathrm{DecF}^{d}(F) + q(q+1)(1 + q2^{-m})2^{-m}.$$

The result holds for any (quasi)group addition.

This theorem clearly separates the security issues induced by the probabilistic 
dependence between C and F, the C algorithm, the F algorithm, and the MAC 
scheme. 

As an example we can use

$$C(x) = \mathrm{DES}_K(x) \quad \text{and} \quad F(x) = \mathrm{Trunc}(\mathrm{DES}_{K+c}(x))$$

for a given constant $c$, where Trunc truncates a 64-bit string onto its first
half and DES is the Data Encryption Standard [1]. We get a MAC on $b = 32$
bits with a single 56-bit key and blocks of $m = 64$ bits. We obtain

$$\mathrm{DecF}^{d,q}(\mathrm{MAC}) \le f(q) + q(q+1)(1 + q2^{-64})2^{-64}$$

where $f(q)$ is the sum of the best advantages for distinguishing

— $(\mathrm{DES}_K, \mathrm{Trunc} \circ \mathrm{DES}_{K+c})$ from $(\mathrm{DES}_{K_1}, \mathrm{Trunc} \circ \mathrm{DES}_{K_2})$

— DES from $C^*$

— $\mathrm{Trunc} \circ \mathrm{DES}$ from $F^*$

within a total number of query blocks less than $q$. Let $q = \theta 2^{m/2}$ (which is a limit
of $32\theta$ GB of queries). The advantage of any distinguisher is less than
thus the probability of success of any adaptive existential forgery attack is less 
than 2~^^ + . Let us conjecture that / < 2“"^. If we authenticate 

less than 3GB, the probability of success of the best attack is less than 1%. 
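As a rough numerical check of the 3GB figure, the collision term $q(q+1)(1 + q2^{-m})2^{-m}$ of the bound can be evaluated exactly. The sketch below makes two assumptions that are not stated in the text: 1 GB is taken as $2^{30}$ bytes, and the function names are illustrative. It deliberately leaves out the cipher-dependent term $f(q)$.

```python
from fractions import Fraction

def collision_term(q: int, m: int) -> Fraction:
    """Exact value of q(q+1)(1 + q*2^-m)*2^-m from Theorems 8 and 9."""
    two_m = Fraction(2) ** m
    return Fraction(q * (q + 1)) * (1 + Fraction(q) / two_m) / two_m

def blocks_for_gigabytes(gigabytes: int, m: int) -> int:
    """Number of m-bit blocks in the given volume (assumes 1 GB = 2^30 bytes)."""
    return gigabytes * 2**30 * 8 // m

q = blocks_for_gigabytes(3, 64)      # 3 GB of 64-bit blocks
term = collision_term(q, 64)
print(q, float(term))
```

With these assumptions 3GB corresponds to $q = 3 \cdot 2^{27}$ blocks and the term is within a negligible correction of $9/1024 \approx 0.0088$, which stays below the 1% threshold before the contribution of $f$ is added.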

The Advanced Encryption Standard will soon provide better security with 
m = 128. 

It should however be pointed out that this example is a little misleading since
we do not assume any computational bound on the distinguisher, which can
thus perform an exhaustive search. This means that the conjecture is wrong.
We can still modify the result and the computational model by limiting the
time complexity to $t$. All reductions in this paper introduce simulators (for
instance a simulator for the MAC given an oracle for DES) which induce a small
time complexity overhead which is often denoted $O(1)$. As a result we obtain

( 9^2 \ _7 

qif ) ^ 2 , the probability 

of success of any attack which is limited to a complexity of $t - O(1)$ is less than
1% after having authenticated 3GB. 
6 Conclusion

We have shown that the regular CBC-MAC construction provides a secure MAC
when the output is encrypted. The security analysis suggests that if $m$ is the
block length of the underlying block cipher, then we should not use the MAC
construction on more than $2^{m/2}$ blocks in total.

In order to fit the security level, we can even reduce the MAC length down to
$\frac{m}{2}$ bits, and shorten the key under an extra security hypothesis. This makes
it possible to prove the security of existing standards.

These results are quite similar to the Petrank-Rackoff ones. Our technique
based on decorrelation theory is however quite systematic and can be applied to
most current MAC constructions with compact proofs.

Finally, we believe that these techniques will contribute to making systematic 
proof analysis of cryptographic schemes and ultimately lead to some automatic 
security validation procedures. 

A Proof of Theorem 4 

Following the Feistel scheme F = F|, Fg ), we let 

Xi = {zi,zD 
z^ = z° + F*{zl) 

Vi = 

We let E be the event zf = zj + F^izf) and zf = zf + F^{zf) for i = 1, . . . , d. 
We thus have [F]'^ y = Pr[F]. We now define 

y = {{yi,...,yd);yi < j zf z^} . 

We can easily check that y fulfill the requirements of Lemma 3. Firstly we have 
13^1 > (^1 - 2™'^ 

thus we let ei = . Second, for y Gy and any x (with pairwise different 

entries), we need to consider [F]f. y. Let be the event that all zfs are pairwise 
different over the distribution of F*. We have 

[F]ly>PT[E/E^]Pr[E^]. 

For computing Pr[F/F^] we know that zfs are pairwise different, as for the zfs. 
Hence Pr[F/F^] = 2“™'^. It is then straightforward that Pr[F^] > 1— 2~^ 

which is $1 - \varepsilon_2$. We thus obtain from Lemma 3 that $\mathrm{DecF}^{d}(F) \le 2d(d-1)2^{-m/2}$.
From Lemma 3 it is straightforward that $\mathrm{DecF}^{d}(C^*) \le d(d-1)2^{-m}$. We thus
obtain $\mathrm{DecP}^{d}(F) \le 2d^2 2^{-m/2}$ for $d \le 2^{1+m/2}$. Since DecF is always less than 2, it
also holds for larger $d$. □
Abstract. Hash functions play an essential role in many areas of
cryptographic applications such as digital signature, authentication, and key
derivation. In this paper, we propose a new hash function with variable
output length, namely HAS-V, to meet the needs of various security levels
desired among different applications. A great deal of attention was
paid to balance the characteristics of security and performance. The use
of message expansion, 4-variable Boolean functions, variable and fixed
amounts of shifts, and interrelated parallel lines provide a high level of
security for HAS-V. Experiments show that HAS-V is about 19% faster
than SHA-1, 31% faster than RIPEMD-160, and 26% faster than HAVAL
on a Pentium PC.



1 Introduction 

A hash function is a function that maps an input with an arbitrary length to 
an output with a specific length, referred to as a hash-code. A one-way hash 
function must obey the preimage and second preimage resistance properties. 
Furthermore, most cryptographic applications require the hash function to sat- 
isfy the collision resistance property, which is a stronger constraint than the 
former two properties. 

The collision of a hash function can be found by the birthday paradox or
square root attack with $2^{n/2}$ operations, where $n$ is the length of the hash-code
[18]. In order to prevent such attacks, the length of the hash-code should be
no less than 128 bits. However, the work of van Oorschot and Wiener [12] on
special-purpose hardware design for parallel collision search suggests that the
minimum length of the hash-code should be 160 bits. Ever since Damgård [6]
established the design principles of a hash function, which included the fact that
the collision resistance of the compression function is sufficient for the collision
resistance of the hash function, almost all hash functions have followed these principles.
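The square root behaviour of the birthday attack can be observed on a deliberately short hash-code. The following is a minimal sketch under stated assumptions: SHA-256 truncated to $n = 24$ bits stands in for an $n$-bit hash function (it is not one of the functions discussed in this paper), and the helper names are illustrative.

```python
import hashlib

def truncated_hash(data: bytes, n_bits: int) -> int:
    """Return the first n_bits of SHA-256(data) as an integer (toy n-bit hash)."""
    digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return digest >> (256 - n_bits)

def birthday_collision(n_bits: int):
    """Hash distinct inputs until two of them share the same hash-code."""
    seen = {}
    counter = 0
    while True:
        msg = counter.to_bytes(8, "big")
        h = truncated_hash(msg, n_bits)
        if h in seen:
            return seen[h], msg, counter + 1
        seen[h] = msg
        counter += 1

m1, m2, trials = birthday_collision(24)
print(m1.hex(), m2.hex(), trials)
```

For a 24-bit hash-code the search typically stops after a few thousand trials, in line with the $2^{n/2} = 2^{12}$ estimate, whereas a preimage search would need about $2^{24}$ trials.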

There are three main categories of hash functions, namely hash functions
based on block ciphers, hash functions based on modular arithmetic, and
dedicated hash functions [13]. Most early hash functions were based on block
ciphers. However, the modification of block ciphers into hash functions resulted
in security weaknesses in collision search as well as performance deterioration.
The slow performance problem is even more serious for hash functions based on
modular arithmetic and serious doubts have been raised about their security.
Consequently, the need for fast and secure hash functions resulted in dedicated
hash functions. These functions are custom designed to achieve the goals of
a cryptographic hash function. Among the numerous dedicated hash functions
that are in use today, the MD4-family hash functions are the most widely used
and analyzed family of hash functions. MD4 [14], MD5 [15], and RIPEMD-160
[9] are popular examples of MD4-family hash functions.

There have been several attacks on the MD4-family hash functions [1, 2, 5, 7].
Among these attacks, the series of attacks by Dobbertin is becoming a real threat
to practical applications. Fortunately, SHA-1 [11] and RIPEMD-160 are
considered to be secure against these attacks [8]. The main distinction of SHA-1 is the
message expansion process, where the message words used in the different rounds
are computed as the XOR (bitwise sum) of previous message words followed by a
circular shift by 1 bit. This prevents local changes confined to a few bits, and
accordingly individual message bits influence the calculations at a large number
of places. RIPEMD-160 is an enhanced version of RIPEMD, designed to be resistant
against Dobbertin's attacks. Its main improvements are the increase in the
number of rounds from 3 to 5 and the modification of the two parallel lines to have
different message ordering, Boolean functions, and shift amounts.

Recently, much progress has been made in the software implementation of
MD4-family hash functions [3, 4]. Analyses show that the structures of MD4-family
hash functions possess a higher instruction-level parallelism than current
general-purpose computer architectures can provide. Among the MD4-family
hash functions, it is known that the critical path to compute the step function
of SHA-1 is shorter than that of any other MD4-family hash function and that the
organization of RIPEMD-160 in two independent lines will become increasingly
useful in the near future.

In the remainder of this paper, we propose a new MD4-family hash function 
that produces a variable length hash-code, namely HAS-V. In Section 2, we 
present details on why a hash function with variable length hash-code is needed. 
In Section 3, the terminologies and notations used are defined. In Section 4, a 
description of the newly proposed hash function is given. In Section 5, we discuss 
the underlying design principles of HAS-V based on performance and security 
aspects. Performance comparison is given in Section 6, and concluding remarks 
are given in Section 7. The pseudo-code and the test values of HAS-V are given 
in the Appendix. 

2 Motivation 

The length of the hash-code is an important factor directly connected to the
security of the hash function. Assuming that there are no unexpected design
flaws known in a hash function, the complexity of finding a collision is heavily
dependent on the length of the hash-code. The length of the hash-code must
be long enough to provide adequate security, but it must not be so long
that it sacrifices the efficiency of the entire system. Consequently, the length of the
hash-code is relative to the computing power possessed by the opponent.
Therefore, the length of the hash-code is meant to vary from time to time as
technology advances and information security broadens its area of application.
It seems inappropriate to fix the length of the hash-code when various levels of
security are desired among different applications.

KCDSA [10], Korea Certificate-based Digital Signature Algorithm, is an ex- 
ample of a cryptographic application where a variable length of hash-code is 
needed. KCDSA employs variable length domain parameters in order to fulfill 
the various security needs in different applications. In the case of KCDSA, there 
is a need for a hash function that can produce a variable hash-code of up to 256 
bits in order to fully utilize the flexible security level of KCDSA. 

However, most conventional hash algorithms are designed to produce a
specific length of hash-code, such as 128 bits for MD4 and MD5, and 160 bits for
SHA-1 and RIPEMD-160. Among the well-known hash functions, HAVAL [19] is
the only hash function that can produce a variable length hash-code. Although
HAVAL is still considered to be secure, there are some concerns that a suitable
modification of the attack on MD4 could be applied to HAVAL with 3 passes.
Furthermore, HAVAL suffers from performance deterioration on CISC processors
due to the excessive number of chaining variables used. There exists an
optional extension of RIPEMD-128 and RIPEMD-160 to produce 256-bit and
320-bit hash-codes. However, these methods do not provide any increase in security
level, merely an increase in the length of the hash-code. This gives a clear
motivation to design a new hash function with variable length hash-code, which
is both efficient and secure.

Information security is becoming an inevitable part of our society, and there- 
fore information technology must provide services to fulfill the needs of various 
people. The need for variable length hash-code can be explained in an analogous 
way. Moreover, the ever-increasing nature of computing power will eventually 
threaten the length of the hash-codes used in many applications today. Instead 
of redesigning a new hash function in such events, the use of a single hash func- 
tion with a variable length hash-code seems to be a cost-effective and convenient 
way of increasing the security level. 



3 Terminology and Notations 

The use of byte in this paper implies an 8-bit quantity, word implies a 32-bit
quantity, and block implies a 1024-bit quantity, which is the input size of the
compression function. We assume a byte with the most significant bit of each
byte listed first and a block with the least significant byte of each block given
first. Throughout this paper, the following notations will be used:

— $+$ : addition of words, i.e. addition modulo $2^{32}$.

— $A^{\ll s}$ : the circular left shift of $A$ by $s$ bit positions.
Table 1. Characteristics of HAS-V

Length of Input Block (bits)      1024
Length of Output (bits)           128 ~ 320
Number of Rounds                  10
Number of Chaining Variables      10
Number of Steps                   200

Table 2. Initial values

A          B          C          D          E
67452301   efcdab89   98badcfe   10325476   c3d2e1f0

F          G          H          I          J
8796a564   465a6978   0f1e2d3c   a0b1c2d3   68794e5f

— $\neg$ : the bitwise complement operation.

— $\vee$, $\wedge$, $\oplus$ : the bitwise OR, AND, and XOR operations. ($X \wedge Y$ is also denoted
as $XY$ for simplicity.)

4 Description of HAS-V Algorithm 

The basic structure of the compression function of HAS-V is two parallel lines,
denoted as the X-line and the Y-line, consisting of 100 steps each. Each line
is composed of 5 rounds, where each round consists of 20 steps, and maintains
5 words of chaining variables, a total of 10 chaining variables for the entire
compression function. The two parallel lines are interrelated by swapping the
entire contents of the chaining variables in the X-line and the Y-line after each
round. The message words used in the compression function are 32 words, or a
1024-bit block, of the input message and 8 additional words generated in each
round by message expansion, a total of 40 words for the entire compression
function. The characteristics of the structure of HAS-V are summarized in Table 1.

Append Padding Bits and Length: The message is padded so that its
length is congruent to 952 modulo 1024. Padding is performed by appending
a single "1" bit and as many zero bits as necessary to satisfy the above constraint. The
remaining 72 bits, needed to reach a multiple of 1024 bits, are filled by appending
the desired length of the hash-code represented in bytes and the length of the
input in bits, reduced modulo $2^{64}$.
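The padding rule can be sketched as follows for byte-aligned messages. This is a minimal illustration rather than the reference implementation: the field order (hash-code length first, then message length) follows the order of the sentence above, and the little-endian encoding of the 64-bit length is an assumption based on the byte ordering convention of Section 3. The function name `has_v_pad` is hypothetical.

```python
def has_v_pad(message: bytes, hash_len_bytes: int) -> bytes:
    """Pad a byte-aligned message to a multiple of 1024 bits (128 bytes)."""
    if not 16 <= hash_len_bytes <= 40:
        raise ValueError("hash-code length must be between 128 and 320 bits")
    padded = message + b"\x80"                  # a single 1 bit followed by seven 0 bits
    # 952 bits = 119 bytes: add zero bytes until the length is 119 modulo 128.
    padded += b"\x00" * ((119 - len(padded)) % 128)
    padded += bytes([hash_len_bytes])           # 8 bits: hash-code length in bytes
    bit_length = (8 * len(message)) % 2**64     # 64 bits: input length in bits
    padded += bit_length.to_bytes(8, "little")  # byte order is an assumption
    return padded

block = has_v_pad(b"abc", 20)                   # 160-bit hash-code requested
print(len(block), block[119], block[120:].hex())
```

The two trailing fields occupy 8 + 64 = 72 bits, which together with the 952-bit residue gives exactly 1024 bits.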

Initial Value of the Chaining Variables: The initial values of the chaining
variables used in HAS-V are given in Table 2. A, B, C, D, E are the chaining
variables of the X-line and F, G, H, I, J are the chaining variables of the Y-line.

Message Preparation and Expansion: The length of the input block used
in each compression function is 1024 bits. The upper 512 bits consist of 16 words,
X[0], X[1], ..., X[15], that are used in the X-line and the remaining lower 512
bits consist of 16 words, Y[0], Y[1], ..., Y[15], that are used in the Y-line. Each
line of the compression function then additionally generates 4 message words in
each round by message expansion. The extra messages are created by the XOR of
4-word combinations chosen from its present line. The word combinations used
in message expansion differ for every round of the compression function and are
given in Table 3.

Table 3. Message combination to generate extra messages

Index   Round 1          Round 2        Round 3        Round 4        Round 5
16      0, 1, 2, 3       3, 6, 9, 12    12, 5, 14, 7   7, 2, 13, 8    15, 9, 5, 3
17      4, 5, 6, 7       15, 2, 5, 8    0, 9, 2, 11    3, 14, 9, 4    12, 8, 6, 2
18      8, 9, 10, 11     11, 14, 1, 4   4, 13, 6, 15   15, 10, 5, 0   13, 11, 7, 1
19      12, 13, 14, 15   7, 10, 13, 0   8, 1, 10, 3    11, 6, 1, 12   14, 10, 4, 0

For example, the message word X[17] used in round 2 of the X-line is generated
as follows:

$$X[17] = X[15] \oplus X[2] \oplus X[5] \oplus X[8].$$

The rest of the message words, X[16], X[18], X[19], can be expanded in a similar
way. The message words in the opposite Y-line, Y[16], Y[17], Y[18], Y[19], can
be derived in an analogous way.
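The message expansion of one line can be written directly from Table 3. The sketch below is illustrative only: the dictionary `EXPANSION` and the function `expand` are hypothetical names, and words are modelled as Python integers.

```python
from functools import reduce
from operator import xor

# Table 3: for each round, the four word indices XORed to form words 16..19.
EXPANSION = {
    1: {16: (0, 1, 2, 3),   17: (4, 5, 6, 7),  18: (8, 9, 10, 11), 19: (12, 13, 14, 15)},
    2: {16: (3, 6, 9, 12),  17: (15, 2, 5, 8), 18: (11, 14, 1, 4), 19: (7, 10, 13, 0)},
    3: {16: (12, 5, 14, 7), 17: (0, 9, 2, 11), 18: (4, 13, 6, 15), 19: (8, 1, 10, 3)},
    4: {16: (7, 2, 13, 8),  17: (3, 14, 9, 4), 18: (15, 10, 5, 0), 19: (11, 6, 1, 12)},
    5: {16: (15, 9, 5, 3),  17: (12, 8, 6, 2), 18: (13, 11, 7, 1), 19: (14, 10, 4, 0)},
}

def expand(words, round_number):
    """Return the 20 message words of one line for the given round (1..5)."""
    if len(words) != 16:
        raise ValueError("expected 16 input words")
    extra = [reduce(xor, (words[i] for i in EXPANSION[round_number][k]))
             for k in (16, 17, 18, 19)]
    return list(words) + extra

X = [(0x9E3779B9 * (i + 1)) & 0xFFFFFFFF for i in range(16)]   # arbitrary test words
W = expand(X, 2)
print(hex(W[17]))
```

In every round the four combinations together cover each of the 16 input words exactly once, so each input word influences exactly one expanded word per round.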



Ordering of the Message Words: Each of the 20 message words (16 initial
input message words and 4 expanded message words) is applied
to a single step in each line. The order of message words used in both lines is
identical. The ordering of the message words for each round is given in Table 4.
The extra messages generated by message expansion, X[16], X[17], X[18], X[19],
are applied to steps 10, 15, 0, and 5, respectively, in each round.
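The ordering can be checked mechanically. The sketch below transcribes Table 4 into a hypothetical list `ORDER` (one row per round) and verifies two properties stated in the text: every round uses each of the 20 words exactly once, and the expanded words always occupy steps 10, 15, 0, and 5.

```python
# Table 4: ORDER[r][step] is the index of the message word used at that step of round r + 1.
ORDER = [
    [18, 0, 1, 2, 3, 19, 4, 5, 6, 7, 16, 8, 9, 10, 11, 17, 12, 13, 14, 15],
    [18, 3, 6, 9, 12, 19, 15, 2, 5, 8, 16, 11, 14, 1, 4, 17, 7, 10, 13, 0],
    [18, 12, 5, 14, 7, 19, 0, 9, 2, 11, 16, 4, 13, 6, 15, 17, 8, 1, 10, 3],
    [18, 7, 2, 13, 8, 19, 3, 14, 9, 4, 16, 15, 10, 5, 0, 17, 11, 6, 1, 12],
    [18, 15, 9, 5, 3, 19, 12, 8, 6, 2, 16, 13, 11, 7, 1, 17, 14, 10, 4, 0],
]

def check_ordering(order):
    """Verify that each round is a permutation of 0..19 with fixed expansion slots."""
    for schedule in order:
        if sorted(schedule) != list(range(20)):
            return False
        if [schedule[s] for s in (10, 15, 0, 5)] != [16, 17, 18, 19]:
            return False
    return True

print(check_ordering(ORDER))
```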



Step Operation: The operation in each step is equivalent in both the X-line 
and the Y-line. The step operation of the X-line is given below. 

$T \leftarrow A^{\ll s} + f(B, C, D, E) + X + K,$

E ^ D ■, D^C ■ C ^ A - A^T. 

Here /, s, and K are the Boolean function, shift amount, and additive constant, 
respectively. The Boolean function and constant differ for every round of the 
compression function, whereas, the shift amount varies for every step within a 
single round of the compression function. 



Boolean Function: The following 5 Boolean functions are used in HAS-V. 



$f_0(x, y, z, u) = xy \oplus \neg x z \oplus yu \oplus zu,$
Table 4. Message ordering

Step   Round 1   Round 2   Round 3   Round 4   Round 5
0      18        18        18        18        18
1      0         3         12        7         15
2      1         6         5         2         9
3      2         9         14        13        5
4      3         12        7         8         3
5      19        19        19        19        19
6      4         15        0         3         12
7      5         2         9         14        8
8      6         5         2         9         6
9      7         8         11        4         2
10     16        16        16        16        16
11     8         11        4         15        13
12     9         14        13        10        11
13     10        1         6         5         7
14     11        4         15        0         1
15     17        17        17        17        17
16     12        7         8         11        14
17     13        10        1         6         10
18     14        13        10        1         4
19     15        0         3         12        0


Table 5. Order of Boolean function

Line   Round 1   Round 2   Round 3   Round 4   Round 5
X      f0        f1        f2        f3        f4
Y      f4        f3        f2        f1        f0



$f_1(x, y, z, u) = xz \oplus y \oplus u,$

$f_2(x, y, z, u) = xy \oplus \neg x u \oplus z,$

$f_3(x, y, z, u) = x \oplus yz \oplus u \ (= f_1(y, x, z, u)),$

$f_4(x, y, z, u) = \neg x y \oplus xz \oplus yu \oplus zu \ (= f_0(x, z, y, u)).$

The Boolean functions are applied, in each line, as in Table 5.
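The five Boolean functions translate directly into bitwise operations on 32-bit words. The sketch below is an illustration, not the reference code of the Appendix; it also checks the two symmetry relations noted in the definitions, $f_3(x,y,z,u) = f_1(y,x,z,u)$ and $f_4(x,y,z,u) = f_0(x,z,y,u)$.

```python
from itertools import product

MASK = 0xFFFFFFFF  # words are 32 bits

def f0(x, y, z, u):
    return ((x & y) ^ (~x & z) ^ (y & u) ^ (z & u)) & MASK

def f1(x, y, z, u):
    return ((x & z) ^ y ^ u) & MASK

def f2(x, y, z, u):
    return ((x & y) ^ (~x & u) ^ z) & MASK

def f3(x, y, z, u):
    return (x ^ (y & z) ^ u) & MASK

def f4(x, y, z, u):
    return ((~x & y) ^ (x & z) ^ (y & u) ^ (z & u)) & MASK

# Checking all 16 single-bit inputs is enough, since the functions act bitwise.
symmetric = all(
    f3(x, y, z, u) == f1(y, x, z, u) and f4(x, y, z, u) == f0(x, z, y, u)
    for x, y, z, u in product((0, 1), repeat=4)
)
print(symmetric)
```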

Shifts: For both lines, the shift amount is given in Table 6. The period of the 
shift amount is 20 steps in the compression function. 

Constants: Additive constants are taken as the integer parts of the numbers
given in Table 7.
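The hexadecimal constants of Table 7 can be recomputed as integer parts of $2^{30}\sqrt{n}$ for $n = 2, 3, 5, 7$. The sketch below uses exact integer arithmetic to avoid floating-point rounding; the function name `additive_constant` is illustrative.

```python
from math import isqrt

def additive_constant(n: int) -> int:
    """Integer part of 2^30 * sqrt(n), computed exactly with integer arithmetic."""
    return isqrt(n << 60)   # floor(sqrt(n * 2^60)) = floor(2^30 * sqrt(n))

constants = [0] + [additive_constant(n) for n in (2, 3, 5, 7)]
print([f"{k:08x}" for k in constants])
```

The first three non-zero values coincide with the round constants of SHA-1, and the last one with a constant of RIPEMD-160.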

Swapping of the Chaining Variables: The contents of the chaining variables 
in the X-line and the Y-line are swapped after every round. 



$A \leftrightarrow F$ ; $B \leftrightarrow G$ ; $C \leftrightarrow H$ ; $D \leftrightarrow I$ ; $E \leftrightarrow J$
Table 6. Shift amount

Step mod 20   0    1    2    3    4    5    6    7    8    9
s             5    11   7    13   15   6    13   9    5    11

Step mod 20   10   11   12   13   14   15   16   17   18   19
s             7    12   8    15   13   8    15   6    7    14



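Table 7. Constants

Line   Round 1                            Round 2                            Round 3                            Round 4                            Round 5
X      0                                  5a827999                           6ed9eba1                           8f1bbcdc                           a953fd4e
                                          $\lfloor 2^{30}\sqrt{2} \rfloor$   $\lfloor 2^{30}\sqrt{3} \rfloor$   $\lfloor 2^{30}\sqrt{5} \rfloor$   $\lfloor 2^{30}\sqrt{7} \rfloor$
Y      $\lfloor 2^{30}\sqrt{7} \rfloor$   ...                                ...                                ...                                0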

Final Feedforward Process of Chaining Variables: Let us assume that
the contents of the chaining variables before the compression function are A ~ J,
and let the contents of the chaining variables after the compression function be
AA ~ JJ. Then the updated contents of the chaining variables, or the output
of the compression function, are given as shown.

A += AA, B += BB, C += CC, D += DD, E += EE,

F += FF, G += GG, H += HH, I += II, J += JJ.



Output Tailoring: In the case of a 320-bit hash-code, the output is given as the
contents of the 10 chaining variables concatenated, i.e. $A\|B\|C\|D\|E\|F\|G\|H\|I\|J$.
Otherwise, when the length of the hash-code is required to be shorter
than 320 bits, it must be tailored into a string of specified length, denoted as
$O_0\|O_1\|\ldots\|O_t$ $(t = 3, 4, \ldots, 8)$. The contents of $O_i$ differ for each of the various
lengths of hash-codes. Let us denote a $t$-bit string as $X^{[t]}$ to explicitly indicate
the length of $X$.

— Case 1 (128-bit hash-code): The 32-bit chaining variables E and J can be 
divided as follows: 

E = J = 

Oi is calculated as follows: 

= A + F + Oi = B + G + 4^®i, 

02 = G + H+ j[^^\ Os = D + I + 

— Case 2 (160-bit hash-code): $O_i$ is calculated as follows.

$O_0 = A + F$, $O_1 = B + G$, $O_2 = C + H$, $O_3 = D + I$, $O_4 = E + J$.

— Case 3 (192-bit hash-code): The 32-bit chaining variables D, E, I, and J 
can be divided as follows: 

D = E = 

r _ r[ll] 7-[ll] r ( 10 ] r _ j[H] j[H] t [ 10 ] 

1 — i 2 -^1 -^0 ’ ^ — ^2 ' 
$O_i$ is calculated as follows:



Oo= A+(4“'ii4“'), Oi 

02= c + O3 

04 = G + ( 4 “'||i?r'), O 5 






— Case 4 (224-bit hash-code): The 32-bit chaining variables E, I, and J can 
be divided as follows: 



E = i = J = jf^ J q^ ■ 



Oi is calculated as follows: 

Oo = A + (4«]||/f ), Oi = S + (4®1||481), 
o, = G+ (4®i||/^4 03 = 0 + (4®]||j^]), 

Oi = E + 4“’, 05 = 0 + Ef^\ Oe = H + E^^^l 

— Case 5 (256-bit hash-code): The 32-bit chaining variables E and J can be 
divided as follows: 



p rp[^] pis] rp[^] pis] T t[8] t[S] t[8] t[^ 

tj — h,2 -tji -Cjq , J — J2 Uq 



[ 8 ] 



Oi is calculated as follows: 



Oq — A + 4 
02 = 0+4®', 
0^ = F + Ef\ 
OQ = H + Ef\ 



Oi = 5 + 4 ®’, 

03 = 0 + 4 ®', 
05 = 0 + 5'®', 

Or= / + 5'®'. 



— Case 6 (288-bit hash-code): The 32-bit chaining variable J can be divided as 
follows: 

r _ Tin Tin M r[6] d6] 

U — U4 U3 ^2 '^0 ■ 



Oi is calculated as follow. 



Oo = A+jP, Ot = B + 4\ 

02 = 0+ jP , 03 = 0 + 4®’, O 4 = 5 + 4®’, 
O 5 = 5, Oe = O, O 7 = H, Os = I. 



5 Design Rationales and Security Aspects

In this section, we discuss the underlying principles that were considered in 
the process of designing HAS-V. A great deal of attention was paid to balance 
the characteristics of security and performance. Security matters are considered 
based on previous attacks on hash functions and employ firm design philosophies 
of the previous hash functions. Performance matters are considered in the area 
of hardware support and algorithmic parallelism. 
Number of Chaining Variables in Step Operation: One of the big differences
between CISC processors, including the Intel 80x86 family and the Motorola
680x0, and RISC processors, including SPARC, MIPS, PA-RISC, PowerPC,
and Alpha, is the number of on-chip general-purpose registers [4]. Generally,
RISC processors have enough general-purpose registers to load all the
chaining variables into registers. However, due to their complex instruction
set, CISC processors usually suffer from a shortage of general-purpose registers.
In the case of a Pentium processor, there are only 7 general-purpose registers
on its chip. Assuming at least 1 register is needed for temporary storage, no
more than 6 registers can be used in an iterative step operation without
performance deterioration. HAVAL uses 8 chaining variables in its step operation and
could suffer from performance deterioration on CISC processors. Therefore, in
the design of HAS-V, a twin structure was employed and the number of chaining
variables used in the step operation was chosen to be 5, so that the entire set of
chaining variables could be loaded on the processor during the iterative process.
It may seem at first that the two lines should be processed simultaneously since
the chaining variables are swapped after every round. However, the two lines of
the compression function can be processed independently as the entire set of chaining
variables is swapped instead of just a portion of it. From an implementation point
of view, the chaining variables are not actually swapped. Instead, the message
words, Boolean function, and constants used in the step operation of round 2
and round 4 are replaced by the corresponding ones of the opposite line. This
can be better understood by referring to the pseudo-code in Appendix A.

Process of Message Words: Early attacks on MD4 and MD5 were based
on the weakness of the rather straightforward usage of the message words. An
attack on the last two rounds of MD4 [1] and the cryptanalysis of MD4 [7] fall
into this category of attack. A single message word of the input is only used
once in every round of MD4 and MD5. This seems to create a vulnerability to
inner collisions. A concept of message expansion was introduced in SHA-1, which
provided a concrete security level against these sorts of attacks. This attractive
property of message expansion was employed in the design of HAS-V. However,
the generation of 64 message words from 16 message words of input seemed to
place a heavy burden on the performance of SHA-1. In HAS-V, we generate
20 message words from 16 message words for each line. This allows HAS-V to
stay within a fairly good performance range, while providing enough diffusion
from a single message word.



Step Operation: The structure of the step operation is an important factor in
determining performance. It is known that the step operation used in SHA-1
possesses a natural algorithmic parallelism in its compression function [4].
This arises from the fact that the updated chaining variable is not used in the
Boolean function of the next step operation, which has the effect of reducing
the critical path length. This mechanism has been employed in the step operation
of HAS-V to provide a further advantage in performance. Other characteristics of
the step operation of HAS-V are the use of 4-variable Boolean functions and the
use of a shift amount that varies for each step within a single round of the
compression function. The variable shift amount seems to provide better immunity
against attacks such as the differential collision attack on SHA-0 [5]: with
variable shift amounts, the generalization of inner collisions to the full
compression function seems to be harder.



Boolean Function: Previous attacks, such as the collision attack on the
compression function of MD5 [2] and differential attacks, use linear
approximations of the Boolean functions. HAS-V employs 4-variable Boolean
functions, whereas most MD4-family hash functions employ 3-variable Boolean
functions. Having an extra variable in the Boolean function increases both the
complexity of a linear approximation and the computational cost of the Boolean
function. Therefore, when constructing a Boolean function, it is important to
balance the need for non-linearity against the loss of computational efficiency.
The computation of the Boolean functions in HAS-V requires about 3 or 4 unit
operations (see the footnote on unit operations), whereas the 3-variable Boolean
functions used in other hash functions require about 2 or 3 unit operations.
Therefore, by using 4-variable Boolean functions and omitting a single addition
in the step function, we can improve the security of HAS-V without performance
deterioration. Among the numerous 4-variable Boolean functions, we have selected
ones that are 0-1 balanced, satisfy the strict avalanche criterion (SAC), and
have a high non-linearity [16,17].



Output Tailoring: HAS-V must provide a variable output length from 128 bits to
320 bits, in increments of 32 bits. We have modified the output tailoring method
of HAVAL to produce the desired output length, while giving all of the chaining
variables a fair share in the result. Moreover, this process must not place an
unnecessary burden on the overall workload and thereby degrade performance.



Endianness: As with most of the MD4-family hash functions, the newly proposed
hash function is optimized for 32-bit architecture processors. HAS-V favors
'little-endian' architectures. Processors with 'big-endian' architectures have
to byte-reverse each word before processing; since big-endian processors are
generally faster, it was decided to let them do the reversing. This incurs a
performance penalty of about 25%.

Footnote. The number of unit operations of a Boolean function can be defined as
the least number of bit-wise operations such as $\wedge$, $\vee$, or $\oplus$
that are required to compute it. It can be regarded as a performance measure.
The number of unit operations of the Boolean functions $f_0$, $f_1$, and $f_2$
of HAS-V can be found by rewriting them. For example, since the truth table of
$f_0$ is equivalent to that of $(x \oplus u)(y \oplus z) \oplus z$, the number
of unit operations of $f_0$ is no more than 4. In a similar way, those of $f_1$
and $f_2$ are 3 and 4, respectively.
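The equivalence used in the footnote can be checked exhaustively over the 16 possible inputs. The sketch below is an illustration only: the Python function names are ours, single bits are modelled as the integers 0 and 1, and the formulas are those listed for the first three rounds of the first line in Appendix A.1.

```python
from itertools import product

def f0(x, y, z, u):
    # f_0 as listed in Appendix A.1: xy XOR (NOT x)z XOR yu XOR zu
    return (x & y) ^ ((x ^ 1) & z) ^ (y & u) ^ (z & u)

def f0_short(x, y, z, u):
    # Rewritten form from the footnote: (x XOR u)(y XOR z) XOR z, 4 bit-wise operations
    return ((x ^ u) & (y ^ z)) ^ z

def f1(x, y, z, u):
    # xz XOR y XOR u, 3 bit-wise operations
    return (x & z) ^ y ^ u

def f2(x, y, z, u):
    # xy XOR (NOT x)u XOR z
    return (x & y) ^ ((x ^ 1) & u) ^ z

def f2_short(x, y, z, u):
    # Equivalent form x(y XOR u) XOR u XOR z, 4 bit-wise operations
    return (x & (y ^ u)) ^ u ^ z

def truth_table(f):
    # Values of f over all 16 inputs (x, y, z, u)
    return [f(*bits) for bits in product((0, 1), repeat=4)]

same_f0 = truth_table(f0) == truth_table(f0_short)
same_f2 = truth_table(f2) == truth_table(f2_short)
ones = {name: sum(truth_table(f)) for name, f in (("f0", f0), ("f1", f1), ("f2", f2))}
```

A function on 4 variables is 0-1 balanced when its truth table contains exactly 8 ones, which is what `ones` records for each of the three functions.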
Table 8. Comparison of speed performance (Mbits/sec)

Algorithm         Pentium III (600MHz),     Ultra 2 SPARC (300MHz),
                  Microsoft Visual C++      GNU C
MD5               41.95                     22.04
SHA-1             22.58                     10.54
RIPEMD-160        19.38                      8.94
HAVAL (5 pass)    20.64                     11.01
HAS-V             27.86                     12.58




6 Performance Evaluation 

In this section we compare the performance of MD5, SHA-1, RIPEMD-160,
HAVAL, and HAS-V. Output tailoring was ignored for both HAVAL and HAS-V,
since it takes only a negligible amount of time. The implementations were
written in the C language without any optimization, solely for the purpose of
comparison. The performance results were obtained by hashing 64 Mbytes of data
using an 8 Kbyte buffer. Table 8 shows the results of our experiment. The
results show that HAS-V has better performance than SHA-1, RIPEMD-160, or HAVAL
with 5 passes in both environments.

The step operation of HAS-V consists of 3 additions, 2 circular shifts, and a
Boolean function. Since the Boolean function consists of 4 unit operations, a
single step operation consists of 9 unit operations, assuming that an addition
and a circular shift each count as one unit operation. The total number of unit
operations for generating the extra message words is 2 (lines) x 5 (rounds) x
4 (messages) x 3 (unit operations) = 120 (unit operations). Therefore the number
of unit operations needed to hash a 1024-bit block is given below.

1 (block) x 200 (steps) x 9 (unit operations per step) + 120 (message expansion)

= 1920 (unit operations)

In the case of RIPEMD-160, the total number of unit operations needed to hash
1024 bits (two 512-bit blocks) is given below.

2 (blocks) x 160 (steps) x 9 (unit operations per step)

= 2880 (unit operations)

HAS-V therefore needs about 33% fewer operations than RIPEMD-160 (equivalently,
RIPEMD-160 needs 50% more). This is in line with Table 8, where on the Pentium
III the throughput of RIPEMD-160 is about 30% lower than that of HAS-V (19.38
versus 27.86 Mbits/sec).
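The arithmetic above can be reproduced in a few lines. This is only a worked restatement of the counts given in the text and of two entries of Table 8; the variable names are ours.

```python
# Unit-operation counts for hashing 1024 bits, as stated in the text.
OPS_PER_STEP = 9                      # 3 additions + 2 circular shifts + 4 Boolean operations
HAS_V_STEPS = 200                     # 2 lines x 5 rounds x 20 steps
HAS_V_EXPANSION = 2 * 5 * 4 * 3       # lines x rounds x extra words x XORs per word

has_v_ops = 1 * HAS_V_STEPS * OPS_PER_STEP + HAS_V_EXPANSION
ripemd160_ops = 2 * 160 * OPS_PER_STEP   # two 512-bit blocks, 160 steps each

# Relative saving in operations, measured against RIPEMD-160.
ops_saving = 1 - has_v_ops / ripemd160_ops

# Pentium III throughputs from Table 8 (Mbits/sec).
has_v_speed, ripemd160_speed = 27.86, 19.38
speed_gap = 1 - ripemd160_speed / has_v_speed

print(has_v_ops, ripemd160_ops)
print(f"operations saved: {ops_saving:.1%}, throughput gap: {speed_gap:.1%}")
```

Both percentages are measured relative to the larger quantity, which is why the operation-count figure and the measured throughput figure land close to each other.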

7 Conclusion 

We have proposed a new hash function with a variable-length hash-code, namely
HAS-V. It was designed to be both secure and efficient in most computing
environments. We expect that our results will broaden the use of KCDSA or any
other cryptographic application that uses hash functions. We believe that the
variable hash-code length will meet the needs of various practical
applications.
A Pseudo-Code of HAS-V

A.1 Definition

Let the input message string consist of t 1024-bit blocks, represented as
$\{X_i[j], Y_i[j]\}$ $(0 \le i < t,\ 0 \le j < 16)$.



$f_j(x,y,z,u) = xy \oplus \neg x z \oplus yu \oplus zu$   $(0 \le j \le 19)$
$f_j(x,y,z,u) = xz \oplus y \oplus u$   $(20 \le j \le 39)$
$f_j(x,y,z,u) = xy \oplus \neg x u \oplus z$   $(40 \le j \le 59)$
$f_j(x,y,z,u) = x \oplus yz \oplus u$   $(60 \le j \le 79)$
$f_j(x,y,z,u) = \neg x y \oplus xz \oplus yu \oplus zu$   $(80 \le j \le 99)$

$g_j(x,y,z,u) = \neg x y \oplus xz \oplus yu \oplus zu$   $(0 \le j \le 19)$
$g_j(x,y,z,u) = x \oplus yz \oplus u$   $(20 \le j \le 39)$
$g_j(x,y,z,u) = xy \oplus \neg x u \oplus z$   $(40 \le j \le 59)$
$g_j(x,y,z,u) = xz \oplus y \oplus u$   $(60 \le j \le 79)$
$g_j(x,y,z,u) = xy \oplus \neg x z \oplus yu \oplus zu$   $(80 \le j \le 99)$

$K_j$ = 00000000    $K'_j$ = a953fd4e    $(0 \le j \le 19)$
$K_j$ = 5a827999    $K'_j$ = 8f1bbcdc    $(20 \le j \le 39)$
$K_j$ = 6ed9eba1    $K'_j$ = 00000000    $(40 \le j \le 59)$
$K_j$ = 8f1bbcdc    $K'_j$ = 5a827999    $(60 \le j \le 79)$
$K_j$ = a953fd4e    $K'_j$ = 6ed9eba1    $(80 \le j \le 99)$



s(j) = 5, 11, 7, 13, 15, 6, 13, 9, 5, 11, 7, 12, 8, 15, 13, 8, 15, 6, 7, 14

m(j) = 18, 0, 1, 2, 3, 19, 4, 5, 6, 7, 16, 8, 9, 10, 11, 17, 12, 13, 14, 15   $(0 \le j \le 19)$
m(j) = 18, 3, 6, 9, 12, 19, 15, 2, 5, 8, 16, 11, 14, 1, 4, 17, 7, 10, 13, 0   $(20 \le j \le 39)$
m(j) = 18, 12, 5, 14, 7, 19, 0, 9, 2, 11, 16, 4, 13, 6, 15, 17, 8, 1, 10, 3   $(40 \le j \le 59)$
m(j) = 18, 7, 2, 13, 8, 19, 3, 14, 9, 4, 16, 15, 10, 5, 0, 17, 11, 6, 1, 12   $(60 \le j \le 79)$
m(j) = 18, 15, 9, 5, 3, 19, 12, 8, 6, 2, 16, 13, 11, 7, 1, 17, 14, 10, 4, 0   $(80 \le j \le 99)$

$a_j(k)$ = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15   (j = 0)
$a_j(k)$ = 3, 6, 9, 12, 15, 2, 5, 8, 11, 14, 1, 4, 7, 10, 13, 0   (j = 20)
$a_j(k)$ = 12, 5, 14, 7, 0, 9, 2, 11, 4, 13, 6, 15, 8, 1, 10, 3   (j = 40)
$a_j(k)$ = 7, 2, 13, 8, 3, 14, 9, 4, 15, 10, 5, 0, 11, 6, 1, 12   (j = 60)
$a_j(k)$ = 15, 9, 5, 3, 12, 8, 6, 2, 13, 11, 7, 1, 14, 10, 4, 0   (j = 80)
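The tables above are related in a simple way: in every round, m(j) appears to be the permutation $a_j(k)$ with the four expanded words 18, 19, 16, 17 placed in front of each group of four. The sketch below checks this observation (ours, read off the tables rather than stated by the authors) and applies the expansion rule used in the pseudo-code of A.2, where each extra word is the XOR of four input words. The dictionary and function names are assumptions made for the example.

```python
# Word-selection tables a_j(k), keyed by the first step j of each round.
A_ORDER = {
    0:  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
    20: [3, 6, 9, 12, 15, 2, 5, 8, 11, 14, 1, 4, 7, 10, 13, 0],
    40: [12, 5, 14, 7, 0, 9, 2, 11, 4, 13, 6, 15, 8, 1, 10, 3],
    60: [7, 2, 13, 8, 3, 14, 9, 4, 15, 10, 5, 0, 11, 6, 1, 12],
    80: [15, 9, 5, 3, 12, 8, 6, 2, 13, 11, 7, 1, 14, 10, 4, 0],
}

# Message order m(j), one list of 20 indices per round.
M_ORDER = {
    0:  [18, 0, 1, 2, 3, 19, 4, 5, 6, 7, 16, 8, 9, 10, 11, 17, 12, 13, 14, 15],
    20: [18, 3, 6, 9, 12, 19, 15, 2, 5, 8, 16, 11, 14, 1, 4, 17, 7, 10, 13, 0],
    40: [18, 12, 5, 14, 7, 19, 0, 9, 2, 11, 16, 4, 13, 6, 15, 17, 8, 1, 10, 3],
    60: [18, 7, 2, 13, 8, 19, 3, 14, 9, 4, 16, 15, 10, 5, 0, 17, 11, 6, 1, 12],
    80: [18, 15, 9, 5, 3, 19, 12, 8, 6, 2, 16, 13, 11, 7, 1, 17, 14, 10, 4, 0],
}

def message_order(a):
    """Rebuild m(j) from a_j(k): one expanded word, then four input words."""
    order = []
    for group, extra in enumerate((18, 19, 16, 17)):
        order.append(extra)
        order.extend(a[4 * group:4 * group + 4])
    return order

def expand(words, a):
    """Append the 4 extra words: word[16 + k] is the XOR of four selected input words."""
    out = list(words[:16])
    for k in range(4):
        out.append(words[a[4 * k]] ^ words[a[4 * k + 1]]
                   ^ words[a[4 * k + 2]] ^ words[a[4 * k + 3]])
    return out

# Toy input: word i has only bit i set, so each extra word shows which inputs it mixes.
toy = expand([1 << i for i in range(16)], A_ORDER[20])
```

With the toy input, `toy[16]` has bits 3, 6, 9, and 12 set, matching the first four entries of the j = 20 row of $a_j(k)$.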



A.2 Pseudo-Code

Initial values of the chaining variables

h0 = 67452301; h1 = efcdab89; h2 = 98badcfe; h3 = 10325476; h4 = c3d2e1f0;

h5 = 8796a5b4; h6 = 4b5a6978; h7 = 0f1e2d3c; h8 = a0b1c2d3; h9 = 68794e5f;

for i = 0, ..., t - 1 {

    A = h0; B = h1; C = h2; D = h3; E = h4;
    F = h5; G = h6; H = h7; I = h8; J = h9;

    for j = 0 to 99 {
        if (j = 0, 40, 80) {
            for k = 0 to 3 {
                X_i[16 + k] = X_i[a_j(4k)] ⊕ X_i[a_j(4k + 1)] ⊕ X_i[a_j(4k + 2)]
                              ⊕ X_i[a_j(4k + 3)];
            }
        }
        else if (j = 20, 60) {
            for k = 0 to 3 {
                Y_i[16 + k] = Y_i[a_j(4k)] ⊕ Y_i[a_j(4k + 1)] ⊕ Y_i[a_j(4k + 2)]
                              ⊕ Y_i[a_j(4k + 3)];
            }
        }
        if (round = 1, 3, 5) {
            T = (A <<< s(j)) + f_j(B, C, D, E) + X_i[m(j)] + K_j;
        }
        else if (round = 2, 4) {
            T = (A <<< s(j)) + g_j(B, C, D, E) + Y_i[m(j)] + K'_j;
        }
        E = D; D = C; C = B <<< 30; B = A; A = T;
    }

    for j = 0 to 99 {
        if (j = 0, 40, 80) {
            for k = 0 to 3 {
                Y_i[16 + k] = Y_i[a_j(4k)] ⊕ Y_i[a_j(4k + 1)] ⊕ Y_i[a_j(4k + 2)]
                              ⊕ Y_i[a_j(4k + 3)];
            }
        }
        else if (j = 20, 60) {
            for k = 0 to 3 {
                X_i[16 + k] = X_i[a_j(4k)] ⊕ X_i[a_j(4k + 1)] ⊕ X_i[a_j(4k + 2)]
                              ⊕ X_i[a_j(4k + 3)];
            }
        }
        if (round = 1, 3, 5) {
            T = (F <<< s(j)) + g_j(G, H, I, J) + Y_i[m(j)] + K'_j;
        }
        else if (round = 2, 4) {
            T = (F <<< s(j)) + f_j(G, H, I, J) + X_i[m(j)] + K_j;
        }
        J = I; I = H; H = G <<< 30; G = F; F = T;
    }

    h0 += F; h1 += G; h2 += H; h3 += I; h4 += J;
    h5 += A; h6 += B; h7 += C; h8 += D; h9 += E;
}



B Test Values of HAS-V 

The test values of HAS-V are given for the case of a 320-bit hash-code with no
output tailoring.

HAS-V("") = 475974be d7ea137d 982d1df5 b2583b1a c4d5941d
            8d557bb3 03586742 d8891788 943a9668 a9da68c3
HAS-V("abc") = a70ab818 294865cf 9c9697d6 97152353 70381583
               3f8f1a42 e0150588 8b002e43 05fe6405 519f595c
Abstract. Welch-Gong (WG) transformation sequences are binary sequences
of period $2^n - 1$ with 2-level autocorrelation. These sequences were
discovered by Golomb, Gong and Gaal in 1998 and verified for
$5 \le n \le 20$. Later on, No, Chung and Yun found another way to construct
the WG sequences and verified their result for $5 \le n \le 23$. Dillon
first proved this result for odd n in 1998, and finally, Dobbertin and Dillon
proved it for even n in 1999. In this paper, we investigate a two-faced
property of the WG transformation sequences for application in stream
ciphers and pseudo-random number generators. One face is the randomness
or unpredictability of the WG transformation sequences. The other is the
security of the WG transformations regarded as Boolean functions. It is
shown that the WG transformation sequences, in addition to the known 2-level
autocorrelation, have three-level cross correlation with m-sequences, a large
linear span increasing exponentially with n, and an efficient implementation.
Thus this is the first type of pseudo-random sequences with good correlation
and statistical properties, large linear span and efficient implementation.
When the WG transformations are regarded as Boolean functions, it is proved
that they have high nonlinearity. A criterion for whether the WG
transformations regarded as Boolean functions are r-resilient is derived. It is
shown that the WG transformations regarded as Boolean functions have large
linear span (this concept will be defined in this paper) and high degree.

Key words: Stream cipher, pseudo-random sequence (number) generator,
auto/cross correlation, linear span, Boolean function, non-linearity,
r-resilient property.



1 Introduction 

Pseudo-random sequences have been widely used in communications and
cryptology. In order to guarantee that pseudo-random sequence generators have
good randomness or unpredictability, we have the following criteria:

— Long period

— Balance property (Golomb Postulate 1 [5])

— Run property (Golomb Postulate 2)

— n-tuple distribution

— Two-level autocorrelation (Golomb Postulate 3)

— Low-level cross correlation

— Large linear span and smoothly increasing linear span profiles

In the last three years, the study of binary sequences with 2-level
autocorrelation has made significant progress. Researchers [4,6,12,14,16] have
found a number of new classes of binary sequences with 2-level autocorrelation.
In general, a pseudo-random sequence generator which generates sequences with
2-level autocorrelation can resist a correlation attack. However, it is not easy
to design a pseudo-random sequence generator which generates sequences having
both 2-level autocorrelation and large linear span and which is also efficient
to implement. Fortunately, one of the new classes of sequences with 2-level
autocorrelation, the so-called Welch-Gong transformation sequences, possesses
all three of these properties. Moreover, this type of sequence has period
$2^n - 1$. Any binary sequence of period $2^n - 1$ is related to a function from
the finite field $GF(2^n)$ to the finite field $GF(2)$, and is thus
automatically related to a Boolean function in n variables. Hence there is a
connection among binary sequences with period $2^n - 1$, polynomial functions
from $GF(2^n)$ to $GF(2)$ and Boolean functions in n variables. Chang, Dai and
Gong [1] tried to use this connection: they applied m-sequences with three-level
cross correlation to construct Boolean functions with the maximal non-linearity.
In [10], Gong and Golomb again made successful use of this connection. They
applied tools from pseudo-random sequence design and analysis to analyze the
S-boxes in DES (Data Encryption Standard). When they considered the relationship
between sequences and functions, they realized that monomials, which correspond
to m-sequences, are not secure when used as component functions in block
ciphers. This led to the concept of linear span for polynomial functions
introduced in their recent work [10]. In this paper, we investigate the
Welch-Gong transformation sequences from two sides. One is to present their
randomness, i.e., autocorrelation, cross correlation with m-sequences, the
balance property, and linear span, when we consider them as sequences. The other
is to derive the nonlinearity, the resilient property, linear span and degree
when they are regarded as Boolean functions.

This paper is organized as follows. In Section 2, we give the definition of
Welch-Gong transformation sequences. In Section 3, we present the randomness
properties of the Welch-Gong transformation sequences, which include an
irregular decimation property, statistical properties, cross correlation with
m-sequences, the Hadamard transform, and the linear span. In Section 4, we
derive the non-linearity and a criterion for the resilient property (note that,
since any Welch-Gong transformation is balanced, the correlation immunity
property becomes the resilient property). In Section 5, we discuss linear span
for the Welch-Gong transformations regarded as Boolean functions and show their
degrees. Section 6 is a conclusion. All proofs omitted from this extended
abstract can be found in the full paper [7].
We conclude this section by providing some preliminaries for sequence
design. The reader is referred to [5] for shift register sequences, [11] for the
theory of finite fields, and [19] and [18] for the motivation and the original
definitions of nonlinearity and the resilient property for Boolean functions.

We will use the following notation throughout the paper:

— $\mathbb{F}_q = GF(q)$, a finite field with q elements, and $\mathbb{F}_q^*$,
the multiplicative group of $\mathbb{F}_q$.

— $\mathbb{F}_2^n = \{\mathbf{x} = (x_0, x_1, \ldots, x_{n-1}) \mid x_i \in \mathbb{F}_2\}$,
a vector space over $\mathbb{F}_2$ of dimension n.

— $\mathbf{a} = \{a_i\}$, a sequence over $\mathbb{F}_2$, i.e., $a_i \in \mathbb{F}_2$,
is called a binary sequence. If $\mathbf{a}$ is a periodic sequence with period n,
then we also denote $\mathbf{a} = (a_0, a_1, \ldots, a_{n-1})$, an element in
$\mathbb{F}_2^n$.

A. Autocorrelation

If $\mathbf{a} = (a_0, a_1, \ldots, a_{p-1})$ is a binary sequence with period p,
its (periodic) autocorrelation function $C(\tau)$ is defined as

$$C(\tau) = \sum_{i=0}^{p-1} (-1)^{a_i + a_{i+\tau}}, \quad \tau = 0, 1, \ldots$$

Here $\tau$ is a phase shift of the sequence $\{a_i\}$.

Ideal (2-Level) Autocorrelation: If

$$C(\tau) = \begin{cases} p & \text{if } \tau \equiv 0 \bmod p, \\ -1 & \text{otherwise,} \end{cases}$$

then we say that the sequence $\{a_i\}$ has the ideal two-level autocorrelation
function.

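B. Cross Correlation

If $\mathbf{a} = (a_0, a_1, \ldots, a_{p-1})$ and $\mathbf{b} = (b_0, b_1, \ldots, b_{p-1})$
are two binary sequences with period p, their (periodic) cross correlation
function $C_{\mathbf{a},\mathbf{b}}(\tau)$ is defined as

$$C_{\mathbf{a},\mathbf{b}}(\tau) = \sum_{i=0}^{p-1} (-1)^{a_i + b_{i+\tau}}, \quad \tau = 0, 1, \ldots$$

Here $\tau$ is a phase shift of the sequence $\{b_i\}$.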
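The two definitions can be tried out on a small example. The sketch below is an illustration only and does not use a WG sequence: it takes the m-sequence of period 15 generated by the recurrence $a_{i+4} = a_i + a_{i+1}$ (characteristic polynomial $x^4 + x + 1$, our choice) and evaluates the periodic correlation functions directly from the sums above. The function names are assumptions made for the example.

```python
def m_sequence():
    """One period (15 bits) of the m-sequence with a[i+4] = a[i] XOR a[i+1]."""
    a = [0, 0, 0, 1]
    while len(a) < 15:
        a.append(a[-4] ^ a[-3])
    return a

def cross_correlation(a, b, tau):
    """C_{a,b}(tau): sum over one period of (-1)^(a_i + b_{i+tau}), indices taken mod p."""
    p = len(a)
    return sum((-1) ** (a[i] ^ b[(i + tau) % p]) for i in range(p))

def autocorrelation(a, tau):
    """C(tau) is the cross correlation of a sequence with itself."""
    return cross_correlation(a, a, tau)

seq = m_sequence()
spectrum = [autocorrelation(seq, tau) for tau in range(len(seq))]
print(seq)
print(spectrum)
```

For this sequence the in-phase value is the period 15 and every out-of-phase value is -1, which is exactly the ideal two-level behaviour defined above.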

C. Hamming Weight

Let $H(\mathbf{s}) = |\{0 \le i < 2^n - 1 \mid s_i = 1\}|$ if $\mathbf{s} = \{s_i\}$
is a binary sequence with period $2^n - 1$, and
$H(s) = |\{x \in \mathbb{F}_{2^n} \mid s(x) = 1\}|$ if $s = s(x)$ is a polynomial
function from $\mathbb{F}_{2^n}$ to $\mathbb{F}_2$. In both cases, we call $H(s)$
the Hamming weight of s. For a positive integer
$r = r_0 + r_1 2 + \cdots + r_{n-1} 2^{n-1}$, $r_i \in \mathbb{F}_2$,
$H(r) = |\{0 \le i < n \mid r_i = 1\}|$ is also called the Hamming weight of the
integer r.
2 Definition of Welch-Gong Transformation Sequence
Generators

In this section, we give the definition of the Welch-Gong transformation
sequence generators. Hereafter, we assume $n \not\equiv 0 \bmod 3$.

Let $g(x) = x + x^{q_1} + x^{q_2} + x^{q_3} + x^{q_4}$, $x \in \mathbb{F}_{2^n}$,
where the $q_i$ are defined by

$$q_1 = 2^k + 1, \quad q_2 = 2^{2k-1} + 2^{k-1} + 1, \quad q_3 = 2^{2k-1} - 2^{k-1} + 1 \quad \text{and} \quad q_4 = 2^{2k-1} + 2^k - 1, \qquad (1)$$

where $n = 3k - 1$, and

$$q_1 = 2^{k-1} + 1, \quad q_2 = 2^{2k-2} + 2^{k-1} + 1, \quad q_3 = 2^{2k-2} - 2^{k-1} + 1 \quad \text{and} \quad q_4 = 2^{2k-1} - 2^{k-1} + 1, \qquad (2)$$

where $n = 3k - 2$. Then a function, say $f(x)$, from $\mathbb{F}_{2^n}$ to
$\mathbb{F}_2$ defined by

$$f(x) = Tr(g(x + 1) + 1), \quad x \in \mathbb{F}_{2^n} \qquad (3)$$

is called the Welch-Gong transformation of $Tr(g(x))$, or the WG transformation
for short.

Let $\alpha$ be a primitive element of $\mathbb{F}_{2^n}$. Let $\mathbf{a} = \{a_i\}$
and $\mathbf{b} = \{b_i\}$ be the sequences whose elements are given by

$$a_i = Tr(g(\alpha^i)), \quad b_i = f(\alpha^i) = Tr(g(\alpha^i + 1) + 1), \quad i = 0, 1, \ldots \qquad (4)$$

Then $\mathbf{b}$ is called a Welch-Gong transformation sequence of $\mathbf{a}$,
or WG sequence for short.

Any function from $\mathbb{F}_{2^n}$ to $\mathbb{F}_2$ is related to a Boolean
function (we will discuss an exact conversion between these two representations
in Section 4). From a WG transformation, we obtain two types of pseudo-random
sequence generators. One is the WG sequences themselves. The other applies WG
transformations, regarded as Boolean functions, to a set of LFSRs in order to
generate sequences, i.e., it uses the WG transformations regarded as Boolean
functions as combining functions or filtering functions in combinatorial
function generators or filtering generators [17]. We refer to these two modes as
WG sequence generators.

Remark 1. In fact, the transform given in (3) can be applied to any function
from $\mathbb{F}_{2^n}$ to $\mathbb{F}_2$. But so far we have not found another
type of $g(x)$ such that its WG transformation sequence has 2-level
autocorrelation. So we restrict ourselves to this specific $g(x)$.
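Definitions (3) and (4) can be turned into a small program for n = 7 (so k = 3 in (2), giving the exponents 5, 21, 13, 29). The sketch below is an illustration under our own assumptions, not the authors' implementation: field elements are stored as 7-bit integers, the field is built with the irreducible polynomial $x^7 + x + 1$, and $\alpha$ is taken to be the class of x, which is primitive because $2^7 - 1 = 127$ is prime. All Python names are hypothetical.

```python
N = 7                      # field degree, n = 3k - 2 with k = 3
POLY = 0b10000011          # x^7 + x + 1 (assumed choice of irreducible polynomial)
PERIOD = (1 << N) - 1      # 127
Q = [5, 21, 13, 29]        # q_1..q_4 from (2) with k = 3

def gf_mul(a, b):
    """Multiply two elements of GF(2^7) given as bit vectors."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:          # reduce as soon as the degree reaches 7
            a ^= POLY
    return r

def gf_pow(a, e):
    """Square-and-multiply exponentiation in GF(2^7)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def trace(x):
    """Tr(x) = x + x^2 + x^4 + ... + x^(2^(n-1)); the result is 0 or 1."""
    t = 0
    for _ in range(N):
        t ^= x
        x = gf_mul(x, x)
    return t

def g(x):
    """g(x) = x + x^q1 + x^q2 + x^q3 + x^q4."""
    r = x
    for q in Q:
        r ^= gf_pow(x, q)
    return r

def wg(x):
    """WG transformation (3): f(x) = Tr(g(x + 1) + 1)."""
    return trace(g(x ^ 1) ^ 1)

ALPHA = 2                                   # the class of x
powers = [gf_pow(ALPHA, i) for i in range(PERIOD)]
a = [trace(g(p)) for p in powers]           # a_i = Tr(g(alpha^i))
b = [wg(p) for p in powers]                 # b_i = f(alpha^i), the WG sequence

def autocorrelation(s, tau):
    """Periodic autocorrelation as defined in Section 1."""
    return sum((-1) ** (s[i] ^ s[(i + tau) % PERIOD]) for i in range(PERIOD))
```

Calling `autocorrelation(b, tau)` for tau = 1, ..., 126 gives the out-of-phase values that the paper's results on WG sequences concern; for larger n the same code works once `N`, `POLY` and `Q` are replaced consistently.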
3 Randomness of WG Sequences

In this section, we discuss the randomness properties of WG sequences,
including their decimation, auto/cross correlation, statistical properties, and
linear span.

A. Decimation Property 

Lemma 1. Let $\alpha$ be a primitive element of $\mathbb{F}_{2^n}$ and
$\mathbf{b}$ be the WG sequence of $\mathbf{a}$. Then the elements of $\mathbf{b}$
can be obtained by performing an irregular decimation on $\mathbf{a}$ as follows:
$b_0 = a_0$ and, for $i > 0$,

$$b_i = \begin{cases} a_{\tau(i)} & \text{if } n \text{ even,} \\ a_{\tau(i)} + 1 & \text{if } n \text{ odd,} \end{cases}$$

where $\tau(i)$ is determined by

$$\alpha^{\tau(i)} = \alpha^i + 1. \qquad (5)$$

Proof. From the definition, $a_0 = Tr(g(1)) = n$ and $b_0 = Tr(g(0) + 1) = n$
(reduced modulo 2). Thus $a_0 = b_0$. For $i > 0$, if n is even, we have

$$b_i = Tr(g(\alpha^i + 1) + 1) = Tr(g(\alpha^i + 1)) + n = Tr(g(\alpha^i + 1)) = a_{\tau(i)}, \quad i = 1, 2, \ldots \qquad (6)$$

Similarly, if n is odd, we have

$$b_i = Tr(g(\alpha^i + 1) + 1) = Tr(g(\alpha^i + 1)) + n = Tr(g(\alpha^i + 1)) + 1 = a_{\tau(i)} + 1, \quad i = 1, 2, \ldots \qquad (7)$$

Thus the assertion is established.

Remark 2. This lemma shows that the WG sequence $\mathbf{b}$ can be obtained by
an irregular decimation from $\mathbf{a}$, where the decimation is determined by
(5). Note that $\mathbf{a}$ is a 5-term sequence and it can be generated by using
five linear feedback shift registers and one AND gate. This property of the WG
sequences allows them to be implemented efficiently for small n by performing
the decimation on $\mathbf{a}$ together with a table look-up.

B. Autocorrelation, 2-Tuple Distribution, and Balance Property

Proposition 1. Let $\mathbf{b}$ be a WG sequence defined by (4). Then
$\mathbf{b}$ is a binary sequence of period $2^n - 1$ with (ideal) 2-level
autocorrelation.

The proof of this proposition can be found in [4].

Remark 3. This result was first discovered by Golomb, Gong and Gaal in [12]
and verified for $5 \le n \le 20$. Later on, No et al. [16] found another way to
construct the WG sequences and verified their result for $5 \le n \le 23$. Dillon
proved it for the odd n case [2], and finally, Dobbertin and Dillon proved it
for the even n case [4], which completely established Proposition 1.
Note that $\mathbf{a}$ is also a 2-level autocorrelation sequence; see [12,4].

Let $\mathbf{s}$ be a binary sequence of period $2^n - 1$. Assume that
$1 \le t \le n$. If, in every period of $\mathbf{s}$, each nonzero t-tuple
$(c_1, c_2, \ldots, c_t) \in \mathbb{F}_2^t$ occurs $2^{n-t}$ times and the zero
t-tuple $(0, \ldots, 0)$ occurs $2^{n-t} - 1$ times, then we say that the
sequence has an ideal t-tuple distribution.

Proposition 2. Any WG sequence has ideal 2-tuple distribution and is balanced
(i.e., in every period $2^n - 1$, zeros occur $2^{n-1} - 1$ times and ones occur
$2^{n-1}$ times).

Proof. Let $\mathbf{b}$ be a WG sequence of period $2^n - 1$. Since $\mathbf{b}$
is a 2-level autocorrelation sequence, for any shift of $\mathbf{b}$, say
$L^{\tau}(\mathbf{b}) = (b_\tau, b_{\tau+1}, \ldots)$, we have

$$|\{0 \le k < 2^n - 1 \mid (b_k, b_{k+\tau}) = (i, j)\}| = \begin{cases} 2^{n-2} - 1 & \text{if } i = j = 0, \\ 2^{n-2} & \text{otherwise,} \end{cases} \qquad (8)$$

where $i, j \in \mathbb{F}_2$. Taking $\tau = 1$ establishes the first assertion.
The second result follows from the first one. (For a detailed proof, please see
the full paper.)
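The definition of an ideal t-tuple distribution is easy to check by counting. The sketch below is an illustration only: it uses the period-15 m-sequence generated by $a_{i+4} = a_i + a_{i+1}$ (our choice, not a WG sequence), for which n = 4, and counts the t-tuples seen in one period with cyclic wrap-around. The function names are assumptions made for the example.

```python
from collections import Counter

def m_sequence():
    """One period (15 bits) of the m-sequence with a[i+4] = a[i] XOR a[i+1]."""
    a = [0, 0, 0, 1]
    while len(a) < 15:
        a.append(a[-4] ^ a[-3])
    return a

def tuple_counts(s, t):
    """Count every window (s_i, ..., s_{i+t-1}) over one period, indices taken mod p."""
    p = len(s)
    return Counter(tuple(s[(i + k) % p] for k in range(t)) for i in range(p))

seq = m_sequence()
pairs = tuple_counts(seq, 2)    # ideal: 2^(4-2) = 4 per nonzero pair, 3 for (0, 0)
bits = tuple_counts(seq, 1)     # balance: 2^3 = 8 ones and 7 zeros
print(dict(pairs), dict(bits))
```

The case t = 1 is the balance property, so the second assertion of Proposition 2 is the t = 1 instance of the same counting argument.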



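C. Hadamard Transform and Cross Correlation

Let $f(x)$ be a function from $\mathbb{F}_{2^n}$ to $\mathbb{F}_2$. Then the
Hadamard transform of $f(x)$ is defined by

$$\hat{f}(\lambda) = \sum_{x \in \mathbb{F}_{2^n}} (-1)^{Tr(\lambda x) + f(x)}, \quad \lambda \in \mathbb{F}_{2^n}. \qquad (9)$$

Property 1. Let $S_v(x) = Tr(x^v)$. Let $n = 2m + 1$ be odd and $v = 2^t + 1$
with $\gcd(t, n) = 1$. Then the Hadamard transform of $S_{2^t+1}(x)$ is given by

$$\hat{S}_{2^t+1}(\lambda) = \begin{cases} 0 & \text{if } Tr(\lambda) = 0, \\ \pm 2^{m+1} & \text{if } Tr(\lambda) = 1. \end{cases}$$

This result was established by Gold [8] in 1968.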
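Property 1 can be observed on the smallest interesting case, n = 5, t = 1, so that m = 2 and the nonzero values should have magnitude $2^{m+1} = 8$. The sketch below evaluates definition (9) by brute force for $S_3(x) = Tr(x^3)$. It is an illustration under our own assumptions: the field is built with the irreducible polynomial $x^5 + x^2 + 1$, elements are 5-bit integers, and the Python names are hypothetical.

```python
N = 5
POLY = 0b100101            # x^5 + x^2 + 1 (assumed choice of irreducible polynomial)

def gf_mul(a, b):
    """Multiply two elements of GF(2^5) given as bit vectors."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= POLY
    return r

def trace(x):
    """Tr(x) = x + x^2 + x^4 + x^8 + x^16; the result is 0 or 1."""
    t = 0
    for _ in range(N):
        t ^= x
        x = gf_mul(x, x)
    return t

def gold(x):
    """S_3(x) = Tr(x^3), the case t = 1 of Property 1."""
    return trace(gf_mul(x, gf_mul(x, x)))

def hadamard(f, lam):
    """Definition (9): sum over all x of (-1)^(Tr(lam * x) + f(x))."""
    return sum((-1) ** (trace(gf_mul(lam, x)) ^ f(x)) for x in range(1 << N))

spectrum = {lam: hadamard(gold, lam) for lam in range(1 << N)}
print(sorted(set(spectrum.values())))
```

The spectrum takes only the three values -8, 0 and 8, and it vanishes exactly at the $\lambda$ with $Tr(\lambda) = 0$.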

Theorem 1. For odd n, let $g(x)$ be defined as in Section 2, and let $f(x)$ be
the Welch-Gong transform of $Tr(g(x))$, i.e., $f(x) = Tr(g(x + 1) + 1)$. Then
$\hat{f}(\lambda)$, the Hadamard transform of $f(x)$, is given by

$$\hat{f}(\lambda) = \hat{S}_{2^t+1}(\lambda^c), \qquad (10)$$

where

$$c = d^{-1} \quad \text{and} \quad d = 2^{2t} - 2^t + 1 \quad \text{for } 3t \equiv 1 \bmod n. \qquad (11)$$

Proof. Here we only give a link with the sources for our proof. The linear span
of WG sequences is given in [12] by Golomb, Gong and Gaal. Later on, No,
Chung and Yun found another way to generate the WG sequences in [16], which
was verified by experimental results. A few months later, Dobbertin proved in
[3] that the sequences generated in [16] are WG sequences, by showing that these
two types of sequences have the same linear span. Dillon proved that the
sequences generated by No-Chung-Yun have 2-level autocorrelation for odd n by
establishing that the Hadamard transform of a No-Chung-Yun sequence is equal to
the right-hand side of equation (10). Thus the result follows from the above
link.

From Theorem 1 and Property 1, we now have the following corollaries.

Corollary 1 (Function Version). Let $n = 2m + 1$ be odd, and let $f(x)$ be the
Welch-Gong transformation function. Then the Hadamard transform of $f(x)$ is
given by

$$\hat{f}(\lambda) = \begin{cases} 0 & \text{if } Tr(\lambda^c) = 0, \\ \pm 2^{m+1} & \text{if } Tr(\lambda^c) = 1, \end{cases}$$

where c is defined in (11) in Theorem 1.

Proof. Applying Property 1 to Theorem 1, the result follows.

Corollary 2 (Sequence Version). Let $\alpha$ be a primitive element of
$\mathbb{F}_{2^n}$ and $f(x)$ be a Welch-Gong transformation function. Let
$\mathbf{a} = \{a_i\}$ be an m-sequence whose elements are defined by

$$a_i = Tr(\alpha^i), \quad i = 0, 1, \ldots$$

and $\mathbf{b} = \{b_i\}$ be the WG sequence whose elements are given by (4).
Then $C_{\mathbf{a},\mathbf{b}}(\tau)$, the cross correlation function between
$\mathbf{a}$ and $\mathbf{b}$, is determined by



where c is defined in (11) in Theorem 1. I.e., the cross correlation function
between $\mathbf{a}$ and $\mathbf{b}$ is three-valued.

D. Linear Span of WG Sequences

The linear span of a sequence is defined to be the length of the shortest
linear feedback shift register which generates the sequence. Sequences with
large linear span are resistant to attacks based on the Berlekamp-Massey
algorithm [15].

Proposition 3. Let $\mathbf{b}$ be a WG sequence of period $2^n - 1$ and let
$LS(\mathbf{b})$ represent its linear span. Then

$$LS(\mathbf{b}) = n(2^{\lceil n/3 \rceil} - 3).$$

A proof of this result is given in Section 5. From Proposition 3, it is clear
that the linear span of the WG sequences of period $2^n - 1$ increases
exponentially with n.

Remark 4. WG sequences of period $2^n - 1$ are the first type of binary
sequences of period $2^n - 1$ which have the balance property, ideal 2-tuple
distribution, 2-level autocorrelation, three-level cross correlation with
m-sequences, and linear span increasing exponentially in n.
4 Non-linearity and Resilient Property
of Welch-Gong Transformations

In this section, we derive the non-linearity and the resilient property of
the WG transformations when they are regarded as Boolean functions. First, we
need to develop a result on the conversion of a polynomial function (from
$\mathbb{F}_{2^n}$ to $\mathbb{F}_2$) to a Boolean function in n variables.

4.1 Isomorphism between $\mathbb{F}_{2^n}$ and $\mathbb{F}_2^n$

Since the finite field $\mathbb{F}_{2^n}$ can be regarded as a vector space of
dimension n over $\mathbb{F}_2$, we have a linear space structure for
$\mathbb{F}_{2^n}$. Let $B = \{\alpha_0, \alpha_1, \ldots, \alpha_{n-1}\}$ be a
basis of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$. Then for every
$x \in \mathbb{F}_{2^n}$ we have

$$x = x_0\alpha_0 + x_1\alpha_1 + \cdots + x_{n-1}\alpha_{n-1}, \quad x_i \in \mathbb{F}_2.$$

Let

$$\delta : x \mapsto \mathbf{x} = (x_0, x_1, \ldots, x_{n-1}). \qquad (12)$$

Then $\delta(x)$ is an isomorphism between $\mathbb{F}_{2^n}$ and
$\mathbb{F}_2^n$ when both of them are regarded as vector spaces.

Let $f(x)$ be a function from $\mathbb{F}_{2^n}$ to $\mathbb{F}_2$. Then $f(x)$
defines a Boolean function in the following way:

$$f(x) = f(x_0\alpha_0 + \cdots + x_{n-1}\alpha_{n-1}) = f_B(x_0, x_1, \ldots, x_{n-1}).$$

Note that $f$ and $f_B$ in the above identity might not be the same. We will
write $f_B(x_0, x_1, \ldots, x_{n-1}) = f(x_0, x_1, \ldots, x_{n-1})$ for short
if it does not cause any confusion in the context.

Remark 5. For a given Boolean function in n variables, we can obtain its
polynomial representation, which is a function from $\mathbb{F}_{2^n}$ to
$\mathbb{F}_2$, by using the so-called Fourier transform; see [10].

4.2 Non-linearity of WG Transformations

Let $f(\mathbf{x})$, $\mathbf{x} = (x_0, x_1, \ldots, x_{n-1}) \in \mathbb{F}_2^n$,
be a Boolean function. A Boolean function $a(\mathbf{x})$ is said to be affine if

$$a(\mathbf{x}) = \sum_{i=0}^{n-1} w_i x_i + c, \quad w_i \in \mathbb{F}_2,\ c \in \mathbb{F}_2.$$

Let

$$A = \Big\{ \sum_{i=0}^{n-1} w_i x_i + c \;\Big|\; w_i \in \mathbb{F}_2,\ c \in \mathbb{F}_2 \Big\}.$$

I.e., A is the set consisting of all affine Boolean functions in n variables.
Then the non-linearity of $f(\mathbf{x})$, denoted by $N_f$, is defined as

$$N_f = \min_{a \in A} d(f, a)$$

where $d(f, a) = |\{\mathbf{x} \in \mathbb{F}_2^n \mid f(\mathbf{x}) \ne a(\mathbf{x})\}|$,
which is called the distance of $f$ from $a$.

Remark 6. From the definition of the distance, we have $d(f, a) = H(f + a)$,
where $H(f + a)$ is the Hamming weight of $f + a$ defined in Section 1.

Note that

$$d(f, a) = |\{x \in \mathbb{F}_{2^n} \mid f(x) \ne a(x)\}|$$

where $f(x)$ and $a(x)$ are the polynomial forms of the Boolean functions
$f(\mathbf{x})$ and $a(\mathbf{x})$ respectively. I.e., the distance of $f$ from
$a$ is the same whether the Boolean form or the polynomial form is used.

Let

$$\hat{f}(\mathbf{w}) = \sum_{\mathbf{x} \in \mathbb{F}_2^n} (-1)^{f(\mathbf{x}) + \mathbf{w} \cdot \mathbf{x}}, \qquad (13)$$

which is called the Walsh transform of the Boolean function $f$ in the
literature.

Theorem 2. Let $n = 2m + 1$ be odd. Let $f(x)$ be the Welch-Gong transformation
defined by (3). Let $f(\mathbf{x})$ be the Boolean form of $f(x)$. Then the
non-linearity of $f(\mathbf{x})$, denoted by $N_f$, is given by

$$N_f = 2^{n-1} - 2^m.$$

In order to prove Theorem 2, we need the following two lemmas, whose proofs
can be found in the full paper.

Lemma 2. Let $f(\mathbf{x})$ be a Boolean function and $a(\mathbf{x})$ be a
linear Boolean function. Let $f(x)$ and $a(x)$ be polynomial representations of
$f(\mathbf{x})$ and $a(\mathbf{x})$ respectively.

1. $a(\mathbf{x}) = \sum_i w_i x_i = \mathbf{w} \cdot \mathbf{x}$ where
$\mathbf{w} = (w_0, w_1, \ldots, w_{n-1}) \in \mathbb{F}_2^n$. Moreover, there
exists some $\lambda \in \mathbb{F}_{2^n}$ such that $a(x) = Tr(\lambda x)$.

2. The Hadamard transform of $f(x)$ and the Walsh transform of $f(\mathbf{x})$
have the following relation:

$$\hat{f}(\mathbf{w}) = \hat{f}(\lambda), \quad \mathbf{w} \in \mathbb{F}_2^n,\ \lambda \in \mathbb{F}_{2^n}, \qquad (14)$$

where $\mathbf{w} \cdot \mathbf{x} = Tr(\lambda x)$.

3. The Hadamard transform of $f$ and the distance of $f(x)$ from $a(x)$ are
related by

$$\hat{f}(\lambda) = 2^n - 2d(f, a)$$

where $\lambda$ is the same as above. Or equivalently,

$$d(f, a) = \frac{2^n - \hat{f}(\lambda)}{2}.$$

Lemma 3. Let $f(x)$ be a function from $\mathbb{F}_{2^n}$ to $\mathbb{F}_2$. Let
$a(x)$ be a linear function from $\mathbb{F}_{2^n}$ to $\mathbb{F}_2$. Then
$d(f, a + 1) = 2^n - d(f, a)$. Moreover, $d(f, a + 1) = \frac{2^n + \hat{f}(\lambda)}{2}$,
where $\lambda$ satisfies $a(x) = Tr(\lambda x)$.

Proof of Theorem 2. Applying Lemma 2-(3), Corollary 1, and Lemma 3, the
assertion follows. (A detailed proof is included in the full paper.)
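The chain of substitutions behind this proof can be written out as follows. This is our reading of how the three cited results combine, not the detailed proof of the full paper; it uses only the facts that, by Theorem 1 and Property 1, the Hadamard transform of $f$ takes values in $\{0, \pm 2^{m+1}\}$, and that it is nonzero for at least one $\lambda$.

```latex
\begin{align*}
  d(f, a)     &= \frac{2^n - \hat{f}(\lambda)}{2} \in \{\, 2^{n-1},\ 2^{n-1} - 2^m,\ 2^{n-1} + 2^m \,\}
              && \text{(Lemma 2-(3), } \hat{f}(\lambda) \in \{0, \pm 2^{m+1}\}\text{)} \\
  d(f, a + 1) &= \frac{2^n + \hat{f}(\lambda)}{2} \in \{\, 2^{n-1},\ 2^{n-1} + 2^m,\ 2^{n-1} - 2^m \,\}
              && \text{(Lemma 3)} \\
  N_f         &= \min_{a \in A} d(f, a) = \frac{2^n - 2^{m+1}}{2} = 2^{n-1} - 2^m .
\end{align*}
```

Every affine function is either a linear function $a$ or its complement $a + 1$, so the two lines cover all of $A$. For example, for $n = 7$ (so $m = 3$) this gives $N_f = 64 - 8 = 56$.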

Remark 7. Chang, Dai and Gong [1] discussed how to construct Boolean functions
with the maximal non-linearity in terms of binary m-sequences with three-valued
cross correlation. In particular, they discussed the case $Tr(x^r)$ for some
special choices of r. Gong and Golomb [10] pointed out that the monomial
functions $Tr(x^r)$ are not secure when used as combining functions or filtering
functions in stream cipher systems or block cipher modes, because they
correspond to m-sequences. However, the WG transformations are not monomials.
We will come back to this question in the next section.

4.3 The Resilient Property of WG Transformations

Let $f(\mathbf{x})$, $\mathbf{x} \in \mathbb{F}_2^n$, be a Boolean function. For
$r > 0$, $f(\mathbf{x})$ is said to be r-order correlation immune if

$$\hat{f}(\mathbf{w}) = 0 \quad \text{for all } \mathbf{w} \in \mathbb{F}_2^n : 1 \le H(\mathbf{w}) \le r. \qquad (15)$$

This definition comes from the result obtained by Xiao and Massey [20], which
is equivalent to Siegenthaler's original definition [18]. If $f(\mathbf{x})$
satisfies (15) and $f(\mathbf{x})$ is balanced, i.e., $H(f(\mathbf{x})) = 2^{n-1}$,
then $f(\mathbf{x})$ is said to be r-resilient.

Let

$$D = \{x \in \mathbb{F}_{2^n}^* \mid Tr(x^c) = 0\} \qquad (16)$$

where c is defined in Theorem 1. Then $|D| = 2^{n-1} - 1$. Recall that
$\{\alpha_0, \alpha_1, \ldots, \alpha_{n-1}\}$ is the basis of $\mathbb{F}_{2^n}$
over $\mathbb{F}_2$. Let

$$R = \{(Tr(\lambda\alpha_0), \ldots, Tr(\lambda\alpha_{n-1})) \in \mathbb{F}_2^n \mid \lambda \in D\}. \qquad (17)$$

Theorem 3. With the above notation, let n be odd. Let $f(\mathbf{x})$ be the
Boolean form of the WG transformation $f(x) = Tr(g(x + 1) + 1)$ defined by (3).
Then $f(\mathbf{x})$ is r-resilient if and only if all vectors in
$W_r = \{\mathbf{w} \mid 1 \le H(\mathbf{w}) \le r\}$ appear in R, i.e.,
$W_r \subset R$.

Proof. Applying Lemma 2, we have $\hat{f}(\mathbf{w}) = \hat{f}(\lambda)$ where

$$a(\mathbf{x}) = \sum_{i=0}^{n-1} w_i x_i = Tr(\lambda x). \qquad (18)$$

Notice that

$$\lambda x = \lambda \sum_{i=0}^{n-1} x_i \alpha_i \implies Tr(\lambda x) = \sum_{i=0}^{n-1} Tr(\lambda\alpha_i) x_i.$$

Combining with (18), we have $w_i = Tr(\lambda\alpha_i)$, $0 \le i < n$.
According to Corollary 1, $\hat{f}(\lambda) = 0$ if and only if
$Tr(\lambda^c) = 0$, where c is defined by (11). Together with the definition of
the resilient property, the assertion follows.

Note. Theorem 3 provides a method for finding r-resilient WG transformations
regarded as Boolean functions.

In the following we show that any WG transformation regarded as a Boolean
function can always be made 1-resilient by a proper basis conversion.

Theorem 4. Let $f(x)$ be a WG transformation. Then there exists at least one
basis of $\mathbb{F}_{2^n}$ such that the Boolean function representation of
$f(x)$ under this basis is 1-resilient.

Proof. First we state the following claim, whose proof can be found in the full
paper.

Claim. There are n linearly independent vectors in R, defined by (17).

Therefore we can assume that $\{\mathbf{a}_0, \ldots, \mathbf{a}_{n-1}\}$ is a
subset of $R_\alpha$ which is linearly independent over $\mathbb{F}_2$. Let A be
an $n \times n$ matrix with row vectors $\mathbf{a}_i$, $i = 0, \ldots, n - 1$.
Let $B = A^{-1}$ and let $\beta_j$ denote the jth column vector of B,
$j = 0, \ldots, n - 1$. Then $\beta = \{\beta_0, \ldots, \beta_{n-1}\}$ is a
basis of $\mathbb{F}_{2^n}$. Then $R_\beta$ is given by

R0 = MX)B\\€D}. 

Therefore the row vectors of BA = A~^A = the identity matrix, belong to 
Rg. Hence fg{xQ, • • • , Xn-i), the Boolean representation of f{x) under the basis 
/3, is a 1-resilient function. 
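
The linear-algebra step of this proof (inverting A over $\mathbb{F}_2$ so that the chosen
rows are carried to the weight-one vectors) can be sketched as follows. This is
only an illustration: the matrix below is a stand-in invertible 0/1 matrix, and
whether its rows actually lie in R is not checked here.

```python
def gf2_inverse(rows):
    """Invert a square 0/1 matrix over GF(2) by Gauss-Jordan elimination."""
    n = len(rows)
    # Augment each row with the matching row of the identity matrix.
    a = [list(r) + [int(i == j) for j in range(n)] for i, r in enumerate(rows)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col]), None)
        if pivot is None:
            raise ValueError("rows are not linearly independent over GF(2)")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col and a[r][col]:
                a[r] = [u ^ v for u, v in zip(a[r], a[col])]
    return [row[n:] for row in a]


def gf2_matmul(x, y):
    """Matrix product over GF(2)."""
    return [[sum(xi[k] & y[k][j] for k in range(len(y))) & 1
             for j in range(len(y[0]))] for xi in x]


# Hypothetical stand-in for the matrix A with rows a_0, ..., a_{n-1}.
A = [
    [1, 1, 1, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 1],
    [0, 0, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 1],
]
B = gf2_inverse(A)
IDENTITY = [[int(i == j) for j in range(7)] for i in range(7)]
print(gf2_matmul(A, B) == IDENTITY)
```

Each row $a_i$ of A is mapped by B to the unit vector $e_i$, which is how the
weight-one vectors required for 1-resilience end up in $R_\beta$.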



5 Linear Span and Degree of WG Transformations 

Let $f(x) = \sum_i c_i x^i$ be a polynomial function from $\mathbb{F}_{2^n}$ to $\mathbb{F}_2$. Let
$I = \{0 \le i \le 2^n - 1 \mid c_i \ne 0\}$. The algebraic degree of $f(x)$, denoted as
$alg(f)$, is defined as

$alg(f) = \max_{i \in I} alg(x^i)$, where $alg(x^i) = H(i)$

and $H(i)$ is the Hamming weight of the binary representation of i.

Let $\mathbf{f}(\mathbf{x})$ be the Boolean form of $f(x)$ and denote the degree of $\mathbf{f}(\mathbf{x})$ as
$deg(\mathbf{f}(\mathbf{x}))$.

Fact 1. The algebraic degree of $f(x)$, a polynomial function from $\mathbb{F}_{2^n}$ to $\mathbb{F}_2$,
is equal to the degree of the Boolean form of the function, i.e.,
$alg(f) = deg(\mathbf{f}(\mathbf{x}))$.

The linear span of $f(x)$ is defined as the number of non-zero coefficients in
$f(x) = \sum_i c_i x^i$, as introduced by Gong and Golomb in [10]. We denote it as
$LS(f(x))$, or simply $LS(f)$ if the context is clear, i.e., $LS(f) = |I|$.

Note. The linear span of a polynomial function from $\mathbb{F}_{2^n}$ to $\mathbb{F}_2$ is equal to
the linear span of the sequence corresponding to the function.

Let $\mathbf{f}(\mathbf{x})$ be a Boolean function. We define the linear span of the Boolean
function $\mathbf{f}(\mathbf{x})$ in terms of its polynomial representations. Let

$\Pi$ = {all bases of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$}.

For $B = \{\alpha_0, \ldots, \alpha_{n-1}\} \in \Pi$, we denote by $f_B(x)$ a polynomial representation
of $\mathbf{f}(\mathbf{x})$ with respect to the basis B. We define the linear span of $\mathbf{f}(\mathbf{x})$,
denoted by $LS(\mathbf{f}(\mathbf{x}))$, as

$LS(\mathbf{f}(\mathbf{x})) = \min_{B \in \Pi} LS(f_B(x))$.

Note that for a given polynomial function $f(x)$, the linear span of any Boolean
representation of $f(x)$ is equal to the linear span of $f(x)$ itself. We will write
this observation as a lemma for later reference.

Lemma 4. With the above notation, let $f(x)$ be a polynomial function from
$\mathbb{F}_{2^n}$ to $\mathbb{F}_2$. Let $\mathbf{f}_B(\mathbf{x})$ be its Boolean form with respect to a basis B of
$\mathbb{F}_{2^n}$ over $\mathbb{F}_2$. Then $LS(\mathbf{f}_B(\mathbf{x})) = LS(f(x))$ for all $B \in \Pi$.

As Youssef and Gong pointed out in their recent work [21], the polynomial
representation of a complicated Boolean function might be just a monomial
function (here monomial means that it has only one trace term in (19), which is
different from the concept of the ordinary monomial, which has exactly one
term). Thus a Boolean function must have a large linear span in order to be
resistant to the interpolation attack. In the following, we will show that the
linear span of the Boolean forms of the WG transformations increases
exponentially with n.

We have the following result whose proof can be found in [12]. 

Proposition 4. Let $f(x) = Tr(g(x+1)+1)$ be the WG transformation defined
by (3). Then

$f(x) = \sum_{i \in I} Tr(x^i)$    (19)

where $I = I_1 \cup I_2$ for $n = 3k - 1$, where

$I_1 = \{2^{2k-1} + 2^{k-1} + 2 + i \mid 0 \le i \le 2^{k-1} - 3\}$, and
$I_2 = \{2^{2k} + 3 + 2i \mid 0 \le i \le 2^{k-1} - 2\}$    (20)

and where $I = \{1\} \cup I_3 \cup I_4$ for $n = 3k - 2$, where

$I_3 = \{2^{k-1} + 2 + i \mid 0 \le i \le 2^{k-1} - 3\}$, and
$I_4 = \{2^{2k-1} + 2^{k-1} + 2 + i \mid 0 \le i \le 2^{k-1} - 3\}$.    (21)

Moreover, in each case, all the elements in I belong to distinct cyclotomic cosets
modulo $2^n - 1$.

Note that the trace functions appearing in (19) depend on the coset size of i
in I.

Theorem 5. Let $f(x) = Tr(g(x+1)+1)$ be the WG transformation defined by
(3). Then $LS(f(x))$, the linear span of f, is given by
$LS(f(x)) = n(2^{\lceil n/3 \rceil} - 3)$.

Proof. According to Proposition 4, all numbers in I belong to different cyclotomic
cosets modulo $2^n - 1$ and $|I| = 2^{\lceil n/3 \rceil} - 3$. So we only need to show that any
coset containing a number in I has full size n. Here we only give a proof for
$n = 3k - 1$ and $s \in I_1$, since proofs for the other cases are similar. Note that
for $s \in I_1$, the binary representation of s has the following pattern

Index:        0  1  ...  k-1  |  k  ...  2k-2  |  2k-1  |  2k  ...  3k-2 = n-1
Binary rep.:  *  *  ...  *    |  0  ...  0     |  1     |  0   ...  0           (22)

where * can take any value from {0, 1}. Let $C_s$ be the coset containing s, then

$C_s = \{s, 2s, \ldots, 2^{n_s - 1}s\}$

where $n_s$ is the smallest integer satisfying $2^{n_s}s \equiv s \pmod{2^n - 1}$. According to
(22), $n_s = n$, i.e., $C_s$ has full size n. Thus for each $s \in I$, the trace function
appearing in Proposition 4 is the trace function from $\mathbb{F}_{2^n}$ to $\mathbb{F}_2$. Thus
$LS(Tr(x^i)) = n$ for each $i \in I$. Therefore the result follows.
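
The counting argument in this proof can be replayed numerically. The sketch
below builds the exponent set I, splits it into cyclotomic cosets modulo
$2^n - 1$, and sums the coset sizes. It assumes the index sets $I_1, \ldots, I_4$ in the
form $I_1 = I_4 = \{2^{2k-1} + 2^{k-1} + 2 + i\}$, $I_2 = \{2^{2k} + 3 + 2i\}$, $I_3 = \{2^{k-1} + 2 + i\}$
with $k = \lceil n/3 \rceil$; the function names are illustrative.

```python
def wg_index_set(n):
    """Exponent set I (assumed form) for n = 3k - 1 or n = 3k - 2."""
    if n % 3 == 0:
        raise ValueError("n must not be a multiple of 3")
    k = -(-n // 3)                      # k = ceil(n / 3)
    half = 2 ** (k - 1)
    if n == 3 * k - 1:
        i1 = [2 ** (2 * k - 1) + half + 2 + i for i in range(half - 2)]
        i2 = [2 ** (2 * k) + 3 + 2 * i for i in range(half - 1)]
        return i1 + i2
    i3 = [half + 2 + i for i in range(half - 2)]
    i4 = [2 ** (2 * k - 1) + half + 2 + i for i in range(half - 2)]
    return [1] + i3 + i4


def cyclotomic_coset(s, n):
    """The coset {s, 2s, 4s, ...} modulo 2^n - 1."""
    m = 2 ** n - 1
    coset, t = [], s % m
    while t not in coset:
        coset.append(t)
        t = (2 * t) % m
    return coset


def linear_span(n):
    """Sum of the sizes of the distinct cosets met by the exponent set."""
    cosets = {frozenset(cyclotomic_coset(s, n)) for s in wg_index_set(n)}
    return sum(len(c) for c in cosets)


def coset_leaders(n):
    return sorted(min(cyclotomic_coset(s, n)) for s in wg_index_set(n))


for n in (5, 7, 8):
    k = -(-n // 3)
    print(n, coset_leaders(n), linear_span(n), n * (2 ** k - 3))
```

For n = 7 the coset leaders are 1, 3, 7, 19, 29, and five cosets of size 7 give
a linear span of 35, in agreement with the Example section.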

Proof of Proposition 3 in Section 3. The linear span of a WG sequence b is
equal to the linear span of the corresponding WG transformation f, i.e.,
$LS(b) = LS(f)$. Applying Theorem 5, the result follows.

Theorem 6. Let $f(x)$ be the WG transformation defined by (3). Then the linear
span of any Boolean form of $f(x)$ is equal to the linear span of $f(x)$, i.e.,
$LS(\mathbf{f}(\mathbf{x})) = n(2^{\lceil n/3 \rceil} - 3)$.

Proof. The result follows from Lemma 4 and Theorem 5. 



Theorem 7. Let $f(x) = Tr(g(x+1)+1)$ be the WG transformation defined by
(3) and $\mathbf{f}(\mathbf{x})$ be its Boolean form. Then $deg(\mathbf{f}(\mathbf{x}))$, the degree of $\mathbf{f}(\mathbf{x})$, is
given by $deg(\mathbf{f}(\mathbf{x})) = \lceil n/3 \rceil + 1$.

Proof. According to Fact 1, the degree of $\mathbf{f}(\mathbf{x})$ is equal to the algebraic degree
of $f(x)$. From the definition, the algebraic degree of $f(x)$ is determined by the
largest Hamming weight among the integers appearing in the set I in
Proposition 4. We can easily verify that k + 1 is this largest weight for both
n = 3k - 1 and n = 3k - 2.
Thus the assertion is established. (See the full paper for a detailed proof.)
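
The Hamming-weight argument can be checked directly for small n. The sketch
below is a minimal check, assuming the exponent sets $I_1, \ldots, I_4$ in the form
$I_1 = I_4 = \{2^{2k-1} + 2^{k-1} + 2 + i\}$, $I_2 = \{2^{2k} + 3 + 2i\}$, $I_3 = \{2^{k-1} + 2 + i\}$; the
function names are illustrative.

```python
def wg_index_set(n):
    """Exponent set I (assumed form) for n = 3k - 1 or n = 3k - 2."""
    if n % 3 == 0:
        raise ValueError("n must not be a multiple of 3")
    k = -(-n // 3)                      # k = ceil(n / 3)
    half = 2 ** (k - 1)
    if n == 3 * k - 1:
        i1 = [2 ** (2 * k - 1) + half + 2 + i for i in range(half - 2)]
        i2 = [2 ** (2 * k) + 3 + 2 * i for i in range(half - 1)]
        return i1 + i2
    i3 = [half + 2 + i for i in range(half - 2)]
    i4 = [2 ** (2 * k - 1) + half + 2 + i for i in range(half - 2)]
    return [1] + i3 + i4


def algebraic_degree(n):
    """Largest Hamming weight among the exponents in I."""
    return max(bin(i).count("1") for i in wg_index_set(n))


for n in (5, 7, 8, 10, 11, 13):
    print(n, algebraic_degree(n), -(-n // 3) + 1)
```

The two printed columns agree for each n listed, matching $\lceil n/3 \rceil + 1$.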

Applying the Siegenthaler inequality [18] and Theorem 7, the following
corollary is immediate.

Corollary 3. For a given WG transformation regarded as a Boolean function
in n variables, r, the order of the resilient property of the function, is bounded
by the following inequality: $r < n - \lceil n/3 \rceil$.



6 Example

In this section, we give an example to illustrate the randomness properties
of WG transformations, regarded as both WG sequences and Boolean functions,
which we obtained in the previous sections. Let $\mathbb{F}_{2^7}$ be defined by the primitive
polynomial $h(x) = x^7 + x + 1$. Let $\alpha$ be a root of $h(x)$. Then $\alpha$ is a primitive
element of $\mathbb{F}_{2^7}$. Since n = 7, we have k = 3, and

$q_1 = 5$, $q_2 = 21$, $q_3 = 13$, $q_4 = 29$, and $g(x) = x + x^5 + x^{21} + x^{13} + x^{29}$.

So, the WG transformation is

$f(x) = Tr(g(x+1)+1) = Tr(x + x^3 + x^7 + x^{19} + x^{29})$.

A. Sequence Aspects of the WG transformation: 

We obtain a WG sequence b = {bi} as follows: 

b = 1000000101000011011100010100101100101010000101100010010111011011 
000110011101100100000110001111010101110100100111111100111101111 

where $b_i = f(\alpha^i)$, i = 0, 1, .... The WG sequence b has the balance property,
ideal 2-tuple distribution, 2-level autocorrelation, 3-valued cross correlation
with an m-sequence $\{a_i\}$, where $a_i = Tr(\alpha^i)$, i = 0, 1, ..., whose values belong
to the set {-1, 15, -17}, the Hadamard transform spectrum belonging to the set
{0, ±16}, and linear span 35.
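
The sequence above can be regenerated from the definitions. The following is a
minimal sketch, assuming $h(x) = x^7 + x + 1$, $g(x) = x + x^5 + x^{21} + x^{13} + x^{29}$ and
$b_i = f(\alpha^i)$ with $f(x) = Tr(g(x+1)+1)$; field elements are encoded as 7-bit
integers and the helper names are illustrative.

```python
N = 7
MOD = 0b10000011  # h(x) = x^7 + x + 1


def gf_mul(a, b):
    """Multiply two elements of GF(2^7), each encoded as a 7-bit integer."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= MOD
    return r


def gf_pow(a, e):
    """Square-and-multiply exponentiation in GF(2^7)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def trace(x):
    t = 0
    for _ in range(N):
        t ^= x
        x = gf_mul(x, x)
    return t


def g(x):
    r = x
    for q in (5, 21, 13, 29):          # q_1, q_2, q_3, q_4
        r ^= gf_pow(x, q)
    return r


def wg(x):
    """WG transformation f(x) = Tr(g(x + 1) + 1)."""
    return trace(g(x ^ 1) ^ 1)


def wg_sequence():
    b, x = [], 1
    for _ in range((1 << N) - 1):
        b.append(wg(x))
        x = gf_mul(x, 0b10)            # multiply by alpha
    return b


def autocorrelation(b, tau):
    n = len(b)
    return sum(1 - 2 * (b[i] ^ b[(i + tau) % n]) for i in range(n))


b = wg_sequence()
print("".join(map(str, b)))
print(sum(b), sorted({autocorrelation(b, tau) for tau in range(1, len(b))}))
```

The first printed line can be compared with the sequence b listed above; the
second line gives the number of ones and the set of out-of-phase
autocorrelation values, which for a 2-level sequence is a single value.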

B. Boolean Function Aspects of the WG transformation: 

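Using the polynomial basis $\{1, \alpha, \alpha^2, \ldots, \alpha^6\}$, where $\alpha^7 + \alpha + 1 = 0$, the
algebraic normal form of the Boolean function that corresponds to $f(x)$ is given
by

$x_0 \oplus x_1x_3 \oplus x_0x_1x_3 \oplus x_2x_3 \oplus x_0x_2x_3 \oplus x_1x_2x_3 \oplus x_0x_3x_4 \oplus x_2x_3x_4$
$\oplus\ x_0x_5 \oplus x_0x_1x_5 \oplus x_0x_2x_5 \oplus x_1x_2x_5 \oplus x_3x_5 \oplus x_1x_3x_5 \oplus x_1x_4x_5 \oplus x_2x_4x_5$
$\oplus\ x_2x_3x_4x_5 \oplus x_1x_6 \oplus x_2x_6 \oplus x_0x_2x_6 \oplus x_0x_3x_6 \oplus x_4x_6 \oplus x_0x_4x_6 \oplus x_2x_4x_6$
$\oplus\ x_1x_3x_4x_6 \oplus x_0x_5x_6 \oplus x_1x_5x_6 \oplus x_1x_2x_5x_6 \oplus x_0x_3x_5x_6$.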
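
An algebraic normal form of this kind can be recomputed from the truth table
with the binary Moebius transform. The sketch below is illustrative only: it
assumes $f(x) = Tr(x + x^3 + x^7 + x^{19} + x^{29})$ over the field defined by $x^7 + x + 1$,
and it encodes the coordinate $x_i$ (with respect to $1, \alpha, \ldots, \alpha^6$) as bit i of an
integer.

```python
N = 7
MOD = 0b10000011  # h(x) = x^7 + x + 1


def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= MOD
    return r


def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def trace(x):
    t = 0
    for _ in range(N):
        t ^= x
        x = gf_mul(x, x)
    return t


def wg(x):
    return trace(x ^ gf_pow(x, 3) ^ gf_pow(x, 7) ^ gf_pow(x, 19) ^ gf_pow(x, 29))


def anf(truth_table):
    """Binary Moebius transform: truth table -> ANF coefficient table."""
    c = list(truth_table)
    for i in range(N):
        step = 1 << i
        for x in range(len(c)):
            if x & step:
                c[x] ^= c[x ^ step]
    return c


def monomial(mask):
    return "".join("x%d" % i for i in range(N) if mask >> i & 1) or "1"


coeffs = anf([wg(x) for x in range(1 << N)])
terms = [m for m in range(1 << N) if coeffs[m]]
print(" + ".join(monomial(m) for m in terms))
print(max(bin(m).count("1") for m in terms))   # algebraic degree
```

The monomials are printed in increasing mask order rather than in the order
used above, so the two lists should be compared as sets.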

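Changing the basis using the following basis conversion matrix

$$
\begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \\ \beta_4 \\ \beta_5 \\ \beta_6 \end{pmatrix}
=
\begin{pmatrix}
1 & 1 & 1 & 0 & 0 & 0 & 1 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} 1 \\ \alpha \\ \alpha^2 \\ \alpha^3 \\ \alpha^4 \\ \alpha^5 \\ \alpha^6 \end{pmatrix}
$$

we obtain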

Xq © Xi © X2 © X3 © X0X3 © X0X1X3 © X2X3 © X0X2X3 © X1X2X3 © X3X4 © X1X3X4©
X2X3X4 © X5 © X0X5 © xia:5 © a;ia;3a;5 © x^XiX^ © x^x^x^ © X2XzXiX^ © xe © xqXq ©
XqX 2 X% © X1X2XQ © XqXsXqS) 

X1X3XQ © X2X4XQ © XiX^X4X(, © X^Xq © XqX^Xq © XxX^Xq © X2X^Xq © X\X2X^Xq © 
x^x^Xq © a;oa; 3 X 5 a ;6 © X2X3X5XQ © X3X4X5X6 

which is a 1-resilient function with nonlinearity 56, algebraic degree 4, and
linear span 35.

7 Conclusion 

We conclude the paper with Table 1, which summarizes the profiles obtained in
the previous sections.



Table 1. Profiles of WG transformations

WG sequence profile | WG sequence | WG trans. as Boolean func. | WG trans. Boolean profile
--------------------+-------------+----------------------------+--------------------------
$2^n - 1$ | Period | Boolean | n variables
Yes | Balance | Balance | Yes
Yes | 2-tuple distribution | NC |
2-level | Auto correlation | NC |
$\{-1, -1 \pm 2^{(n+1)/2}\}$, n odd; optimal w.r.t. the Welch bound | Cross correlation with m-sequences | Nonlinearity | $2^{n-1} - 2^{(n-1)/2}$, n odd
$0, \pm 2^{(n+1)/2}$, n odd | Hadamard transform spectrum | Hadamard transform spectrum | $0, \pm 2^{(n+1)/2}$, n odd
$n(2^{\lceil n/3 \rceil} - 3)$, increases exponentially in n | Linear span | Linear span | $n(2^{\lceil n/3 \rceil} - 3)$, increases exponentially in n
 | NC | Degree | $\lceil n/3 \rceil + 1$
 | NC | r-resilient | $\{r \mid 1 \le r < n - \lceil n/3 \rceil\}$
easy* | Implementation | Implementation | easy
Ideal candidates for combining functions; pseudo-random sequence generators | Applications | Applications | Ideal candidates for combining functions or filtering functions operating on a set of LFSRs

Notations used in Table 1: 

— NC means that there is no corresponding concept between them. 

— *: There are two methods to implement WG sequences. One is to use 5 LFSRs together
with a table look-up (Lemma 1 method). The other is to use a finite field configuration.
The complexity of implementing WG sequences by using the finite field configuration
depends only on the evaluation of the four exponentiations listed in Section 2. In
particular, it depends only on evaluating the exponents $q_3$ and $q_4$, each of which has
k - 1 consecutive 1’s. We will discuss how to efficiently compute these two exponentiations
in a separate paper. (Note. Implementation of the trace function has no cost.)

Abstract. A general stream cipher with memory, in which each ciphertext
symbol depends on both the current and previous plaintext symbols and each
plaintext symbol depends on both the current and previous ciphertext
symbols, is pointed out. It is shown how to convert any keystream generator
into a stream cipher with memory, and their security is discussed. It is
proposed how to construct secure self-synchronizing stream ciphers, keyed
hash functions, hash functions, and block ciphers from any secure stream
cipher with memory. Rather new and unusual designs can thus be obtained,
such as the designs of block ciphers and (keyed) hash functions based on
clock-controlled shift registers only.

Key words: Stream ciphers, block ciphers, keyed hash functions, hash 
functions, conversions, security. 



1 Introduction 

The electronic codebook (ECB) mode of block ciphers is mostly confined to
encryption of relatively short messages in cryptographic protocols for
confidentiality and/or authentication purposes. To increase the resistance to
cryptanalysis, various modes with memory have been suggested, such as the
output feedback (OFB) mode, the cipher block chaining (CBC) mode, the cipher
feedback (CFB) mode, and the counter mode (e.g., see [22]). If a block cipher
is used for encryption in one of these modes with memory, it then essentially
becomes a stream cipher whose next-state and/or output functions are
determined by the secret-key-dependent encryption and/or decryption functions
of the block cipher.

The CBC and CFB as well as some other modes with memory [20] can be
used to produce a message authentication code (MAC), usually also called a
keyed hash function (KHF), which can be combined with encryption as well. It
is proved in [2] that the CBC-MAC (used without CBC encryption) is at least
as secure as the underlying block cipher with respect to the well-known
pseudorandom function probabilistic model. A more general theoretical
framework for the relative security analysis of symmetric encryption modes is
proposed in [3]. Block ciphers can also be used in a number of various modes
with memory to build hash functions (HF’s) (e.g., see [20]).

* Part of this work was done while the author was with the Information Security 
Research Centre, Queensland University of Technology, Brisbane, Australia 



D.R. Stinson and S. Tavares (Eds.): SAC 2000, LNCS 2012, pp. 233-247, 2001. 
(c) Springer-Verlag Berlin Heidelberg 2001

However, stream ciphers need not be based on block ciphers, that is, on
one-to-one functions that are difficult to compute in both directions without
knowledge of the secret key. Instead, there exist many proposals of practically
secure stream ciphers (keystream generators) whose next-state and/or output
functions are very simple and whose initial state is controlled by the secret key
(e.g., see [21], [22], and [19]).

A general way to produce a KHF from a keystream generator and an
unconditionally secure MAC (authentication code, which is typically based on
some linear operations modulo an integer or modulo a polynomial) is to use
the keystream sequence to define the time-variant MAC secret key (see [23],
[12], and [14]). A different construction directly based on a keystream generator
is given in [15], but is shown to be insecure in [23]. The KHF can be combined
with the keystream generator encryption, but then an additional portion of the
keystream sequence is required.

In [1] it is suggested how to build block ciphers of unbalanced Luby-Rackoff 
type, with just a few rounds, from a keystream generator and a KHF or a HF. 
Note that a KHF itself can be produced from a HF by incorporating the secret 
key in the message. Similar constructions are also proposed in [16], along with 
a more elaborate security analysis. 

The main objective of this paper is to show how to construct secure
self-synchronizing stream ciphers, keyed hash functions, hash functions, and
block ciphers from secure stream ciphers in a general, simple, and direct way.
The crucial point is that instead of the keystream generator mode, which is
almost exclusively treated in the open literature, we make use of a more
general stream cipher mode in which each ciphertext symbol depends not only
on the current plaintext symbol, but also on all the previous plaintext symbols.
We discuss the security of all the modes proposed.

A formal proof that a derived mode of operation is at least as secure as the
original mode with respect to certain attacks means that any efficient attack on
the derived mode gives rise to an efficient attack on the original mode.
However, if the original mode is not proved to be secure itself, as is the case
with all known practical stream and block ciphers and (keyed) hash functions in
the open literature, then formal security reduction proofs only show alternative
ways of developing efficient attacks on the original mode. If the original mode
is considered heuristically secure with respect to known attacks, then formal
security reduction proofs do not imply that the derived modes are also
heuristically secure, because they may be vulnerable to previously unknown,
specially adapted attacks. Moreover, in our case, the security of the underlying
stream cipher mode with plaintext memory has not been considered in the open
literature. Accordingly, apart from the formal security analysis, we also deal
with the practical (heuristic) security analysis of the modes proposed.

As usual (e.g., see [18] and [1]), the formal security statements and proofs
are presented in a general language, which can be made mathematically precise
by assuming various mathematical models for cryptanalytic attacks, such as the
polynomial-complexity algorithms.

The stream cipher with memory (SCM) and other stream cipher modes are
briefly described in Section 2. A general way of converting a stream cipher from
the keystream generator mode into the SCM mode is proposed in Section 3
and practical security analysis of both the modes is addressed in Section 4.
Conversions of a stream cipher with memory into a self-synchronizing stream
cipher, a keyed hash function, a block cipher, and a hash function are presented
in Sections 5, 6, 7, and 8, respectively, together with their security analyses. A
proposal of a simple class of stream ciphers with memory to be used is described
in Section 9. Conclusions are given in Section 10.



2 Stream Cipher with Memory (SCM) Mode 

Let $x = (x_t)_{t=0}^{\infty}$ and $y = (y_t)_{t=0}^{\infty}$ denote the plaintext and ciphertext binary
sequences, respectively. Let the binary strings k and r stand for the secret and
randomizing keys, respectively, and let $s = (s_t)_{t=0}^{\infty}$ denote the internal state
sequence, where $s_t$ is a binary string of length M and $s_0(k, r)$ is the initial
internal state determined by k and r. Then, a general binary stream cipher
decipherable without delay is an invertible nonautonomous finite-state machine
with one input and one output that maps an input sequence x into the output
sequence y by the encryption (sequential) transform recursively defined by

$s_{t+1} = F_k(s_t, x_t), \quad y_t = x_t + f_k(s_t), \quad t \ge 0$    (1)

where the addition is binary and $F_k : \{0,1\}^{M+1} \to \{0,1\}^M$ and
$f_k : \{0,1\}^M \to \{0,1\}$ are the (secret-key-dependent) next-state and output
functions, respectively. The inverse, decryption transform is recursively defined by

$s_{t+1} = F_k(s_t, y_t + f_k(s_t)), \quad x_t = y_t + f_k(s_t), \quad t \ge 0.$    (2)

The encryption and decryption transforms are thus defined by the so-called
keystream sequence $z = (z_t = f_k(s_t))_{t=0}^{\infty}$.
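
The two recursions (1) and (2) can be sketched in a few lines. The next-state
and output functions below are toy choices made only for illustration (a 16-bit
shift register with the plaintext bit added to the feedback); they are not
taken from the paper and are not meant to be secure. The secret key is assumed
to enter only through the initial state.

```python
M = 16                      # internal state length (toy value)
MASK = (1 << M) - 1


def output_bit(s):
    """Toy output function f_k(s_t): one linear tap plus one AND term."""
    return ((s >> 15) ^ ((s >> 2) & (s >> 9))) & 1


def next_state(s, x):
    """Toy next-state function F_k(s_t, x_t): LFSR step with x_t in the feedback."""
    fb = ((s >> 15) ^ (s >> 13) ^ (s >> 12) ^ (s >> 10) ^ x) & 1
    return ((s << 1) | fb) & MASK


def encrypt(s0, plaintext):
    s, out = s0, []
    for x in plaintext:
        out.append(x ^ output_bit(s))   # y_t = x_t + f_k(s_t)
        s = next_state(s, x)            # s_{t+1} = F_k(s_t, x_t)
    return out


def decrypt(s0, ciphertext):
    s, out = s0, []
    for y in ciphertext:
        x = y ^ output_bit(s)           # x_t = y_t + f_k(s_t)
        out.append(x)
        s = next_state(s, x)            # s_{t+1} = F_k(s_t, y_t + f_k(s_t))
    return out


S0 = 0xACE1                 # hypothetical initial state s_0(k, r)
message = [(i * i + i // 3) % 2 for i in range(64)]
print(decrypt(S0, encrypt(S0, message)) == message)
```

Decryption works without delay because the keystream bit $f_k(s_t)$ is available
before $y_t$ arrives, and the recovered $x_t$ then drives the same state update as
in encryption.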

Let the input memories of the encryption and decryption transforms be
referred to as the input and output memories of a stream cipher, respectively.
All stream ciphers can be classified into the following three types with respect
to the input and output memories: memoryless, with finite input or output
memory, and with infinite input and output memory.

In the memoryless type, known as the keystream generator (KG) or the
pseudorandom sequence generator type, the next-state function does not depend
on the plaintext symbol, that is, $s_{t+1} = F_k(s_t)$, so that the keystream sequence z
is plaintext independent. The KG type is sensitive to synchronization errors but
has no substitution error propagation. To deal with a possible loss of
synchronization when encrypting longer messages with the same secret key k, a
long message is divided into shorter ones and a randomizing key r is used to
reinitialize the keystream generator for every new message to be encrypted. It
is typically sent in the clear and as such is public rather than secret. The
randomizing key is generated in a random or deterministic way with the property
of being, with high probability, different for every new message to be encrypted
with the same secret key. This is in order to satisfy the so-called one-time-pad
assumption that repetitions of long segments of keystream are highly unlikely.
Namely, to this end, every new plaintext sequence should with high probability
be encrypted by using a different initial state. More generally, repetitions of
internal states should be highly unlikely.

In the finite input or output memory type, the encryption or decryption
transform has finite input memory. In particular, in the self-synchronizing stream
cipher (S²SC) type (i.e., the cipher feedback type), the decryption transform has
finite input memory, that is, $s_{t+1} = (y_{t-i})_{i=0}^{m-1}$ for a finite memory size m, so that
the keystream sequence depends on ciphertext only. As a consequence, the
propagation of both synchronization and substitution errors in decryption is
limited. Its security entirely depends on the output (feedback) function $f_k$,
which must be secret-key-dependent.
In the infinite input and output memory type, the next-state function
effectively depends on the current plaintext symbol in such a way that both the
encryption and decryption transforms have infinite input memory. This type is
typically either not mentioned or just overlooked as a practical possibility in the
open literature on stream ciphers (e.g., see [21], [22], and [19]). Such a type is
called here the stream cipher with memory (SCM) type, to emphasize the fact
that in encryption each ciphertext bit depends not only on the current plaintext
bit, but also on the previous plaintext bits, and that in decryption each
plaintext bit depends on the current and previous ciphertext bits. Note that the
PKZIP stream cipher is of this type (see [5]).

The SCM type is therefore sensitive to both synchronization and substitution 
errors, but, due to infinite input memory, has an inherent potential that can be 
used for message integrity purposes. The errors on real channels should be dealt 
with by separate error-correction and/or detection codes and, in addition, by 
the resynchronization method. However, instead of using and transmitting the 
randomizing key as described above, one may just prepend the randomizing key 
to each new message and keep the initial state secret-key-dependent only, as is 
done in the PKZIP stream cipher. In this case, the randomizing key is encrypted 
rather than transmitted in the clear and hence need not be public. 

The basic types of stream ciphers will also be referred to as the modes of
operation of stream ciphers, because it will be shown that they can be converted
into each other by simple and general constructions. Similarly, other
cryptographic primitives constructed from stream ciphers will also be referred
to as the modes of operation of stream ciphers.



3 Conversion of Keystream Generator (KG) Mode 
into SCM Mode 

Any stream cipher in the KG mode can be converted into the SCM mode by
letting the next-state function depend upon the current plaintext bit too. The
main practical criterion to be respected in this regard is that a change of a
single plaintext bit should give rise to a random looking change in the keystream
(ciphertext) sequence to follow (forward propagation effect). Note that the KG
mode should satisfy the same property, but with respect to changes of the initial
state bits. So, the adaptation is easily achieved by adding the plaintext bit to one
or more of the internal state bits, especially those with the significant forward
propagation effect. For a nonbinary plaintext alphabet, the plaintext symbol
should be incorporated by a function with the property that any change of this
symbol necessarily gives rise to a change of one or more of the internal state
variables. This can generally be achieved by using a quasigroup operation.

In shift-register-based keystream generators, the conversion is easily done by
making the shift registers nonautonomous, that is, by adding the plaintext bit
to the feedback bit for each of the shift registers, and especially those that are
used for clock control. The output of such shift registers should then necessarily
involve the first, input stage. In KGs based on the table-shuffling principle like
RC4 (see [22]), the plaintext symbol should affect the internal state variables
defining the table positions where the changes should be made.
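
As an illustration of the shift-register case, the sketch below makes a small
stop-and-go generator nonautonomous by adding the plaintext bit to the feedback
of both registers. The register lengths, taps, clock-control rule and output
function are hypothetical toy choices, not a design from the paper; the only
point is that the same routine encrypts and decrypts because the keystream bit
is computed before the plaintext bit is absorbed.

```python
def step(r1, r2, x):
    """One clock of the toy generator; returns the new register contents."""
    clock_r2 = r1[1]                        # control bit taken from R1
    r1 = [r1[4] ^ r1[2] ^ x] + r1[:4]       # R1 always steps; x added to feedback
    if clock_r2:
        r2 = [r2[6] ^ r2[5] ^ x] + r2[:6]   # R2 steps only when the control bit is 1
    return r1, r2


def keystream_bit(r1, r2):
    # The output involves the first (input) stages r1[0] and r2[0].
    return r1[0] ^ r2[0] ^ (r1[3] & r2[4])


def process(r1, r2, bits, decrypting):
    """Encrypt (decrypting=False) or decrypt (decrypting=True) a bit list."""
    r1, r2, out = list(r1), list(r2), []
    for b in bits:
        o = b ^ keystream_bit(r1, r2)
        out.append(o)
        x = o if decrypting else b          # the plaintext bit drives the update
        r1, r2 = step(r1, r2, x)
    return out


R1 = [1, 0, 1, 1, 0]            # hypothetical key-dependent initial contents
R2 = [0, 1, 1, 0, 1, 0, 1]
message = [(3 * i + i // 4) % 2 for i in range(48)]
cipher = process(R1, R2, message, decrypting=False)
print(process(R1, R2, cipher, decrypting=True) == message)
```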

4 Security Analysis of SCM and KG Modes 

In view of (1), the change of one plaintext bit necessarily causes the change
of the corresponding ciphertext bit in the KG and SCM modes as well as a
pseudorandom change of only the subsequent ciphertext bits in the SCM mode.
Therefore, without the one-time-pad assumption, cryptanalytic attacks
generating new plaintext/ciphertext pairs by modifying some bits in the known
pairs are feasible. This is relevant for achieving information authenticity. The
one-time-pad assumption can be achieved either without using resynchronization
or by using resynchronization by (with high probability) different randomizing
keys.
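
The difference between the two modes can be observed on a toy cipher. In the
sketch below (illustrative functions only, not from the paper) the flag
`absorb` selects whether the plaintext bit enters the next-state function (SCM
mode) or not (KG mode); one plaintext bit is then flipped and the positions
where the two ciphertexts differ are listed.

```python
M = 16
MASK = (1 << M) - 1


def output_bit(s):
    """Toy output function f_k(s_t)."""
    return ((s >> 15) ^ ((s >> 2) & (s >> 9))) & 1


def next_state(s, x, absorb):
    """Toy next-state function; x_t is used only when absorb is True (SCM mode)."""
    fb = ((s >> 15) ^ (s >> 13) ^ (s >> 12) ^ (s >> 10)) & 1
    if absorb:
        fb ^= x
    return ((s << 1) | fb) & MASK


def encrypt(s0, plaintext, absorb):
    s, out = s0, []
    for x in plaintext:
        out.append(x ^ output_bit(s))
        s = next_state(s, x, absorb)
    return out


def diff_positions(a, b):
    return [i for i, (u, v) in enumerate(zip(a, b)) if u != v]


S0 = 0xACE1                                  # hypothetical initial state
message = [(i * i + i // 3) % 2 for i in range(64)]
flipped = list(message)
flipped[20] ^= 1                             # change a single plaintext bit

kg_diff = diff_positions(encrypt(S0, message, False), encrypt(S0, flipped, False))
scm_diff = diff_positions(encrypt(S0, message, True), encrypt(S0, flipped, True))
print(kg_diff)
print(scm_diff)
```

In KG mode only position 20 changes, so a known pair can be turned into a new
valid pair bit by bit. In SCM mode the ciphertext is unchanged before position
20 and may change at later positions as well; how random those later changes
look depends entirely on the quality of the next-state and output functions.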

We will discuss cryptanalytic attacks in a general, adaptive combined chosen
plaintext and ciphertext scenario, possibly key related as well (e.g., see [18]).
The situation is conceptually similar to one with block ciphers in the ECB mode,
with a difference that we now deal with the plaintext/ciphertext strings rather
than blocks and that each plaintext sequence effectively includes the
randomizing key used. The attacks use a training set consisting of a number of
known, arbitrarily, possibly adaptively, chosen plaintext/ciphertext string pairs
obtained from the same secret key (and the same or different known randomizing
keys), and possibly from a set of related keys as well. An attack is an algorithm
that produces one or more new plaintext/ciphertext string pairs, not included in
the training set, where either the plaintext or ciphertext string in each of these
pairs is assumed to be given. As knowing a plaintext/ciphertext string pair in the
KG mode is equivalent to knowing a keystream/ciphertext string pair, an attack
on the KG mode is in fact an algorithm that reconstructs unknown portions of
a keystream sequence from its known portions or, more generally, from known
portions of a set of keystream sequences obtained from different randomizing
keys and the same secret key.

Typically, it is assumed that the attacks work for any secret key. Ideally, the
exhaustive search over the secret keys or over unknown plaintext or ciphertext
strings should be the most efficient way to perform an attack. If an attack can
recover only particular plaintexts from given ciphertexts, it is then said to be
successful for such plaintexts only, e.g., corresponding to certain plaintext
statistics. On the other hand, secret key reconstruction attacks are the best
known examples of the attacks successful for all the plaintexts.

An attack on the SCM mode that produces new plaintext/ciphertext string 
pairs by modifying a number of end bits in a known plaintext/ciphertext string 
pair is said to be trivial. A trivial attack on the KG mode is defined similarly. As 
argued above, trivial attacks are always computationally feasible for any stream 
cipher in the SCM or KG mode if the one-time-pad assumption is not satisfied. 

A stream cipher in the SCM or KG mode is said to be (practically) secure with
respect to nontrivial cryptanalytic attacks if no such attack is computationally
feasible, relative to available computing power. This security essentially means
that both the encryption and decryption transforms are infeasible to compute
without knowing the secret key, under the one-time-pad assumption.

Since it does not appear possible to formally relate the security of the KG 
and SCM modes with respect to general attacks, we will concentrate on their 
practical security. It turns out that the security of the SCM mode is more related 
to the security of the KG mode with than without resynchronization, which, 
except for [6], is hardly analyzed in the open literature. 

The fact that in the SCM mode the plaintext statistics are disguised by the
plaintext dependent keystream sequence makes both the cryptanalytic attacks
successful for particular plaintexts only and the ciphertext-only cryptanalytic
attacks very unlikely. For the same reason, the attacks consisting in predicting
the keystream without reconstructing the initial state (e.g., based on low linear
[21] or 2-adic [13] complexities) and the attacks finding the statistical weaknesses
of the keystream (e.g., [9]) are also unlikely to succeed. In addition, unlike the
KG mode, assuming that the (prepended) randomizing key is known is not very
realistic in the SCM mode. Also, the known plaintext/ciphertext scenario is
generally less useful for the SCM than for the KG mode, as missing portions of
the plaintext/ciphertext sequences present a difficulty which is harder to
overcome in the SCM than in the KG mode.

On the other hand, the plaintext dependent keystream sequence may open
new possibilities for secret key reconstruction attacks, especially if
resynchronization is used and if the prepended randomizing key is assumed to be
known, although the cryptanalysis is more complicated. For a survey of various
(initial state) secret key reconstruction attacks on the KG mode without
resynchronization, see [21], [7], [10], and [19].

Since the randomizing key plays the role of known plaintext, the cryptanalytic
methods for block ciphers in the ECB mode, such as the differential [4] and linear
[17] cryptanalysis, in principle extend to the KG and SCM modes too, especially
if the secret and randomizing keys are linearly combined together. They are
less likely to succeed here because of the underlying iterative structure. On the
other hand, if long all zero plaintext sequences are allowed to be encrypted,
then, formally, any secret key reconstruction attack on the KG mode directly
extends to the SCM mode. So, the basic criterion for the SCM mode is that the
underlying KG mode is secure (e.g., this is not true for the PKZIP stream cipher
cryptanalyzed in [5]).

If the prepended randomizing key is assumed to be known, then the SCM
mode is required to be secure with respect to nontrivial cryptanalytic attacks
(in particular, secret key reconstruction attacks), for any training set of known
plaintext/ciphertext sequence pairs produced from the same initial state. To
this end, when the plaintext symbol is introduced into a part of the next-state
function of the underlying KG mode by a linear function, it appears reasonable
to recommend that this part of the next-state function should not be linear, as
a whole, with respect to the same type of linearity. Accordingly, in binary
shift-register-based keystream generators at least one of the shift registers
affected by the plaintext bit should be clock controlled or have nonlinear feedback.

5 Conversion of SCM Mode
into Self-Synchronizing Stream Cipher (S²SC) Mode

We should define the secret-key-controlled feedback function $f_k$ of m binary
variables, where m is the output memory size, on the basis of the SCM mode
of a given stream cipher so as to prevent trivial attacks on the SCM mode. A
general and simple way to achieve this is to use an SCM mode with the secret key
k, with the initial state determined only by k, and with the plaintext sequence
whose first m plaintext bits are defined by a given m-bit input to $f_k$ and the
remaining plaintext bits are fixed, possibly to zero. The output bit of $f_k$ is then
defined as the last ciphertext bit obtained after clocking the SCM mode m times
and a specified additional number of times, typically, on the order of several (e.g.,
three) internal memory sizes M of the SCM mode. Similarly, a faster conversion
is obtained by producing a block of n, n < M, ciphertext bits at a time, thus
constructing an n-bit output feedback function $f_k$. In the S²SC mode, its output
is then combined with an n-bit block of plaintext at a time, e.g., by using the
bitwise binary addition.
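
The construction can be sketched for the single-bit case. The toy SCM
functions, the initial state and the all-zero initial ciphertext window are
assumptions made for illustration; the feedback function follows the recipe
above (m input bits, then 3M fixed zero plaintext bits, keeping only the last
ciphertext bit).

```python
M = 16
MASK = (1 << M) - 1


def output_bit(s):
    """Toy SCM output function."""
    return ((s >> 15) ^ ((s >> 2) & (s >> 9))) & 1


def next_state(s, x):
    """Toy SCM next-state function with the plaintext bit in the feedback."""
    fb = ((s >> 15) ^ (s >> 13) ^ (s >> 12) ^ (s >> 10) ^ x) & 1
    return ((s << 1) | fb) & MASK


def feedback_bit(s0, window, extra=3 * M):
    """f_k(window): last ciphertext bit of the SCM run on window + fixed zeros."""
    s, y = s0, 0
    for x in list(window) + [0] * extra:
        y = x ^ output_bit(s)
        s = next_state(s, x)
    return y


def s2sc(s0, bits, m, decrypting):
    """Self-synchronizing mode: keystream depends on the last m ciphertext bits."""
    window, out = [0] * m, []            # assumed all-zero initial window
    for b in bits:
        o = b ^ feedback_bit(s0, window)
        out.append(o)
        y = b if decrypting else o       # the ciphertext bit enters the window
        window = window[1:] + [y]
    return out


S0, MEM = 0xACE1, 8                      # hypothetical key state and memory size
message = [(i * i + i // 3) % 2 for i in range(64)]
cipher = s2sc(S0, message, MEM, decrypting=False)

damaged = list(cipher)
damaged[10] ^= 1                         # one substitution error on the channel
recovered = s2sc(S0, damaged, MEM, decrypting=True)
print([i for i in range(64) if recovered[i] != message[i]])
```

The printed positions all lie between 10 and 10 + m, which is the limited error
propagation expected from a self-synchronizing cipher. Every output bit costs
m + 3M clocks of the underlying cipher, which is why the text suggests
producing n bits per call as a faster variant.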

Any attack on the S²SC mode is essentially an algorithm for producing the
unknown outputs of the feedback function for one or more given inputs, where
a training set of a number of input/output pairs of the feedback function is
assumed to be known. The S²SC mode of a stream cipher is said to be secure
if no such attack is computationally feasible. In particular, the security can be
defined with respect to secret key reconstruction attacks only.

Proposition 1. If the underlying SCM mode is secure with respect to nontrivial
cryptanalytic attacks, then the derived S²SC mode is secure. If the underlying
SCM mode is secure with respect to secret key reconstruction cryptanalytic
attacks, so is the derived S²SC mode.

Proof. It should be proved that any computationally feasible attack on the S²SC
mode can be converted into a nontrivial attack on the SCM mode at an
additional computational cost that is not comparatively significant. This is a
direct consequence of the proposed construction, as any input/output pair for the
feedback function can be extended into a plaintext/ciphertext string pair for the
SCM mode by appending a required number of fixed plaintext bits to the input
bits, where only the last bit (or the last n bits) of ciphertext is known.
Accordingly, one can modify the training set for any attack on the S²SC mode into
the corresponding training set for an attack on the SCM mode and vice versa. As
well, producing an unknown input/output pair of the feedback function directly
extends into producing an unknown bit (or n bits) of ciphertext for a known
plaintext string of the SCM mode. This is by definition a nontrivial attack on
the SCM mode, since sufficiently many fixed plaintext bits are appended.

A similar proof holds for secret key reconstruction attacks. □ 

In other words, the derived S^SC mode is at least as secure as the underlying 
SCM mode with respect to nontrivial attacks, as trivial attacks on the SCM 
mode are prevented by additional clocking. 



6 Conversion of SCM Mode into Keyed Hash Function (KHF) Mode

A binary keyed hash function (KHF) or a message authentication code (MAC)
is a secret-key-dependent function $\{0,1\}^l \to \{0,1\}^n$ that maps binary strings of
variable length l (messages) into binary strings of a fixed length n, where l > n.
This function should be easy to compute when the secret key is known. The
main computational security property required from a KHF is that it should
be computationally infeasible, without knowledge of k, to perform existential
forgery in the (adaptive) chosen message scenario, that is, to find another,
distinct message and its hash value under k, provided a set of (adaptively) chosen
messages and their hash values under k is given. In particular, a KHF should
be secure against any secret key reconstruction attack. Ideally, the exhaustive
search over k should be the most efficient way to find k.

Our objective now is to show how to construct a KHF from a stream cipher 
in the SCM mode. The designs proposed in the literature are either dedicated 
or are based on block ciphers, hash functions, or unconditionally secure MACs 
combined with KGs. They typically require that the message length be a multiple 
of a given positive integer which is achieved by padding. Our construction allows 
an arbitrary message length and is solely based on a stream cipher in the SCM 
mode, which can be easily obtained from any KG mode (as shown in Section 3). 

The construction is similar to the one proposed for the S^SC mode. Let k be
the secret key for the SCM mode of a given stream cipher with the initial state
determined only by k and with the plaintext sequence whose first l plaintext
bits are defined by the given message and the remaining bits are fixed, possibly
to zero. Let M > n, where M is the internal memory size of the SCM mode.
The hash value is then defined as the last n successive ciphertext bits obtained
after clocking the SCM mode l times and a specified additional number of times,
several (e.g., three) times bigger than M.

In a similar way as for the S^SC mode, producing a message and the
corresponding hash value under k for the defined KHF mode directly extends into
producing n unknown ciphertext bits for a known plaintext string of the
underlying SCM mode. Consequently, the following proposition is proved in essentially
the same way as Proposition 1.

Proposition 2. If the underlying SCM mode is secure with respect to nontriv- 
ial cryptanalytic attacks, then the derived KHF mode is secure. If the underlying 
SCM mode is secure with respect to secret key reconstruction cryptanalytic at- 
tacks, so is the derived KHF mode. 

In other words, the derived KHF mode is at least as secure as the underlying 
SCM mode with respect to nontrivial attacks, as trivial attacks on the SCM 
mode are prevented by additional clocking. The same proposition holds even if 
the hash value of a given message is used together with the corresponding cipher- 
text obtained by the SCM with the same k. Accordingly, the KHF mode can be 
combined with the SCM encryption with the same secret key. Additional
protection measures or constraints typically required for the designs proposed in the
literature (e.g., due to the finite input memory of the CBC or CFB decryption)
are not needed here.
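As a minimal sketch of the KHF construction described in this section, the following code hashes a message of arbitrary bit length without padding. The toy SCM step, the parameter values, and the name `khf` are illustrative assumptions and do not represent a secure or proposed cipher.

```python
M = 16                    # toy internal memory size of the SCM mode (assumption)
MASK = (1 << M) - 1

def scm_step(state, p):
    """Clock a toy SCM once: return (ciphertext bit, next state)."""
    out = (state ^ (state >> 3) ^ ((state >> 5) & (state >> 9))) & 1
    fb = ((state >> 15) ^ (state >> 13) ^ ((state >> 2) & (state >> 7))) & 1
    return p ^ out, ((state << 1) | (fb ^ p)) & MASK

def khf(key, message_bits, n=8, extra=3 * M):
    """Keyed hash: last n ciphertext bits after l + extra clocks (requires n < M)."""
    state, tail = key & MASK, []          # initial state determined only by the key
    for p in list(message_bits) + [0] * extra:   # message, then fixed zero bits
        c, state = scm_step(state, p)
        tail.append(c)
    return tail[-n:]
```

A receiver holding the same key would recompute `khf(key, message_bits)` and compare it with the received hash value.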

7 Conversion of SCM Mode 
into Block Cipher (BC) Mode 

A binary block cipher is a secret-key-dependent one-to-one function
$\{0,1\}^n \to \{0,1\}^n$ that maps binary strings of length n (plaintext blocks) into
binary strings of the same length (ciphertext blocks), where the block length n is
usually fixed and relatively short. The function is called the encryption function,
and its inverse is called the decryption function. Both functions should be easy to
compute when the secret key k is known and infeasible to compute when k is
not known, in the (adaptive) chosen plaintext/ciphertext scenario, possibly in the
related-key scenario as well. More precisely, an attack on a block cipher is an
algorithm that, on the basis of a training set consisting of a number of known,
arbitrarily chosen plaintext/ciphertext block pairs, produces one or more new
plaintext/ciphertext block pairs, where either plaintext or ciphertext blocks are
assumed to be given. A block cipher is said to be secure with respect to
cryptanalytic attacks if no such attack is computationally feasible.

Typically, the objective of cryptanalytic attacks on block ciphers is to
reconstruct the secret key, in which case the attacks are successful for all
plaintexts. Differential cryptanalysis [4] in the chosen plaintext scenario and
linear cryptanalysis [17] in the known plaintext scenario are well-known
examples. The vast majority of existing proposals for block ciphers use the
product structure composed of a number of rounds, each involving a relatively simple
one-round function, as in Feistel-type ciphers such as DES.

We will now describe a simple and general way to construct a secure block 
cipher (BC) mode starting from any secure stream cipher in the SCM mode.
The construction essentially requires only three rounds and works for variable
block sizes that are practically limited only by the memory space available. We 
are interested in the SCM mode whose initial state depends on the secret key 
only, without using any randomizing key to satisfy the one-time-pad assumption. 
Such a mode is not secure with respect to the so-called trivial attacks, as it is 
possible to generate, with high probability of success, new plaintext/ciphertext 
string pairs by modifying a number of end bits in the plaintext/ciphertext string 
pairs already known. So, what is essentially needed is a construction that would 
prevent such attacks. 

We propose a product connection of three stream ciphers in the same SCM 
mode whose secret keys are the same, independent, or different (but related). 
Product ciphers with independent keys are usually called cascade ciphers (see 
[18]). In fact, we suggest different keys related in a very simple way, which 
removes the need for a special key schedule. For example, the keys for the second
and the third cipher can be obtained as cyclic shifts of the key for the first stream
cipher (represented as a binary string). The main point is to use the output bits
from the first/second stream cipher in the reverse order to define the input 
bits for the second/third stream cipher, respectively. This creates the required 
forward propagation effect for the second half of the plaintext bits, as noted in 
[11] in a different context of block ciphers based on specific finite automata. It is 
required to memorize the outputs of the first and the second stream cipher. One 
can also use a product connection of only two stream ciphers, but in this case 
the change of the last plaintext bit necessarily gives rise to the change of the first 
ciphertext bit, which may be considered as a weakness for some applications. 
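The three-round product connection with reversed intermediate ciphertexts can be sketched as follows. The toy SCM functions, the 16-bit key, the cyclic shift amounts 5 and 11, and all function names are assumptions chosen for illustration; the sketch shows only the data flow, not a secure cipher.

```python
M = 16                    # toy internal memory size and key length (assumption)
MASK = (1 << M) - 1

def _out(s):
    return (s ^ (s >> 3) ^ ((s >> 5) & (s >> 9))) & 1

def _next(s, p):
    fb = ((s >> 15) ^ (s >> 13) ^ ((s >> 2) & (s >> 7))) & 1
    return ((s << 1) | (fb ^ p)) & MASK

def scm(key, bits, decrypt=False):
    """Toy SCM encryption/decryption; the state is always driven by plaintext bits."""
    s, res = key & MASK, []
    for b in bits:
        o = b ^ _out(s)
        s = _next(s, o if decrypt else b)
        res.append(o)
    return res

def rot(key, r):
    """Cyclic left shift of an M-bit key by r positions."""
    return ((key << r) | (key >> (M - r))) & MASK

def round_keys(key):
    key &= MASK
    return [key, rot(key, 5), rot(key, 11)]   # related keys, no key schedule

def bc_encrypt(key, block):
    x = list(block)
    for i, k in enumerate(round_keys(key)):
        x = scm(k, x)
        if i < 2:
            x.reverse()        # feed the next cipher in reverse order
    return x

def bc_decrypt(key, block):
    x = list(block)
    for i, k in reversed(list(enumerate(round_keys(key)))):
        x = scm(k, x, decrypt=True)
        if i > 0:
            x.reverse()        # undo the reversal between rounds
    return x
```

The block length is simply the length of the input list, which reflects the variable block size noted above; each intermediate output must be stored in full before it can be reversed.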

For the encryption of long messages, the proposed BC mode can be used in 
its basic, ECB form with a large block size. Also, it can be used in the usual 
CBC and CFB modes, with a finite input memory of the decryption transform. 
This may be suitable for some applications, e.g., for the encryption of random 
access files. In addition, one may also use the existing internal memory of the 
underlying stream cipher. For example, one can use the last generated internal 
state for the current plaintext block as the initial state for the next one. The 
obtained cipher can then be considered as an enhanced version of the original 
stream cipher in the SCM mode, because of the triple encryption. 

By using the standard argument (e.g., see [18]), we obtain the following two 
propositions, which essentially show that the BC mode is at least as secure as 
the underlying SCM mode with respect to nontrivial attacks, as trivial attacks 
on the SCM mode are prevented by reversing the intermediate ciphertexts. It is 
assumed that the attacks work for any secret key. 

Proposition 3. If the underlying SCM mode is secure with respect to nontriv- 
ial cryptanalytic attacks, then the derived BC mode with the cascade connection 
is secure. If the underlying SCM mode is secure with respect to secret key re- 
construction cryptanalytic attacks, so is the derived BC mode with the cascade 
connection. 

Proof. It should be proved that any computationally feasible attack on the BC 
mode with the cascade connection can be converted into a nontrivial attack on
the SCM mode at an additional computational cost that is not comparatively
significant. Consider the BC mode where the secret key for the first stream cipher 
is unknown and arbitrary and where the keys for the second and the third stream 
cipher are known and fixed. Then any plaintext/ciphertext block pair for the BC 
mode can be converted into the corresponding plaintext/ciphertext string pair 
for the SCM mode of the first stream cipher and vice versa, at a computational 
cost of two SCM encryptions/decryptions (with known keys) per pair. 

Accordingly, one can modify the training set for any attack on the BC mode 
into the corresponding training set for an attack on the SCM mode and vice 
versa. Also, producing an unknown plaintext/ciphertext block pair for the BC 
mode directly extends into producing an unknown ciphertext string for a known 
plaintext string for the SCM mode. This is a nontrivial attack on the SCM 
mode, since the reversion of intermediate ciphertexts renders the produced plain- 
text/ciphertext string pair(s) different from those obtained by trivial attacks 
(i.e., by modifying pairs from the training set). 

A similar proof holds for secret key reconstruction attacks. □ 

Proposition 4. If the underlying SCM mode is secure with respect to secret 
key reconstruction cryptanalytic attacks in the related key scenario, then the 
derived BC mode with the product connection is secure with respect to secret key 
reconstruction cryptanalytic attacks. 

Proof. It should be proved that any computationally feasible secret key recon- 
struction attack on the BC mode with the product connection can be converted 
into a secret key reconstruction attack on the underlying SCM mode at an ad- 
ditional computational cost that is not comparatively significant. As the keys 
of the second and the third SCM mode are derived from the secret key of the 
first SCM mode, any given plaintext/ciphertext block pair for the BC mode 
can be converted into the corresponding plaintext/ciphertext string pair for the 
first SCM mode and vice versa if two more plaintext/ciphertext string pairs 
are known: one for the second SCM mode and one for the third SCM mode in 
the product connection, both obtained from the (related) keys derived from the 
secret key of the first SCM mode. 

Accordingly, the training set for any secret key reconstruction attack on 
the BC mode can be obtained from the corresponding training set for the 
SCM mode with the same secret key and from two additional training sets for 
the SCM modes with related keys. This is achieved by combining the plain- 
text/ciphertext string pairs from the three training sets into the corresponding 
plaintext/ciphertext block pairs for the BC mode, respectively. The secret key 
for the first SCM mode can then be produced by the secret key reconstruction 
attack on the corresponding BC mode. □ 

Note that, in a special case, Proposition 3 is true for cryptanalytic attacks
successful for particular plaintexts only, as the plaintexts for the first SCM mode
in the cascade and for the whole cascade are the same (see [18]). If very simple
stream ciphers are used so that the security of the SCM mode may be
questionable, then the number of rounds can be increased.






8 Conversion of SCM Mode into Hash Function (HF) Mode

A binary hash function (HF) is defined in the same way as a KHF except that 
the secret key parameter is not used. One-wayness and collision-resistance (or 
collision-freedom) are the two main computational security properties required 
from a HF. A HF is called one-way if it is computationally infeasible to find any 
input that hashes to a given output, for almost all outputs. Ideally, random
guessing, with computational complexity $O(2^n)$, should be the most efficient
way of inverting a HF. A HF is called collision-resistant if it is computationally
infeasible to find any two distinct inputs that hash to the same output. Ideally,
the birthday attack, with computational complexity $O(2^{n/2})$, should be
the most efficient way of producing collisions. Note that the existing proposals
for HFs are either dedicated or are based on block ciphers like DES, and are
typically defined in terms of a so-called compression function which is applied 
iteratively (e.g., see [20] and [1]). 

By fixing the value of the secret key, any KHF obtained from the SCM mode 
of a given stream cipher becomes a candidate HF. However, its security is no 
longer guaranteed by the security of the SCM mode. Namely, the one-wayness 
and collision-resistance properties impose stronger requirements for the SCM 
mode which do not involve the output function of the SCM mode at all and are 
hence more difficult to satisfy. For example, collision-resistance implies that it 
should be computationally infeasible to find any two different plaintext sequences 
that will, starting from the same initial state, produce the same internal state 
at a given time in future. 

We now define a more complicated, but still simple construction of a HF from
the SCM mode whose security relies on the output function as well. The basic
construction consists of two stages. The first stage is similar to the one for a KHF,
except that the plaintext sequence for the SCM mode consists of the l message
bits only and that the (secret) key is fixed and known. The SCM is clocked l
times and the corresponding l bits of the ciphertext are memorized. In the second
stage, the l ciphertext bits in the reverse order are used as the plaintext sequence
for the same SCM mode, but now starting from the last internal state produced
in the first stage. The SCM is clocked l times and an additional number of times
several times bigger than M (as before), and the last n successive ciphertext bits
produced are the hash value.
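The two-stage construction can be sketched as below. The toy SCM step, the fixed public key value `FIXED_KEY`, and the name `hf` are hypothetical choices for illustration; the sketch follows the data flow of the basic construction and makes no security claim.

```python
M = 16                    # toy internal memory size of the SCM mode (assumption)
MASK = (1 << M) - 1
FIXED_KEY = 0x1D87        # fixed, publicly known key value (hypothetical)

def scm_step(state, p):
    """Clock a toy SCM once: return (ciphertext bit, next state)."""
    out = (state ^ (state >> 3) ^ ((state >> 5) & (state >> 9))) & 1
    fb = ((state >> 15) ^ (state >> 13) ^ ((state >> 2) & (state >> 7))) & 1
    return p ^ out, ((state << 1) | (fb ^ p)) & MASK

def hf(message_bits, n=8, extra=3 * M):
    """Two-stage hash: encrypt the message, then re-encrypt its reversed ciphertext."""
    state, stage1 = FIXED_KEY & MASK, []
    for p in message_bits:                    # stage 1: l clocks, ciphertext memorized
        c, state = scm_step(state, p)
        stage1.append(c)
    tail = []
    for p in stage1[::-1] + [0] * extra:      # stage 2 continues from the last state
        c, state = scm_step(state, p)
        tail.append(c)
    return tail[-n:]                          # last n ciphertext bits
```

Stage 1 needs memory proportional to the message length, which is the cost that the iterated variant described later in this section is meant to reduce.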

As in the BC mode, the ciphertext bits are used in the reverse order to 
increase the forward propagation effect for the second half of the message bits. 
Since finding the collisions necessarily involves the output function of the SCM 
mode, the constructed HF seems to be, at least heuristically, at least as secure 
as the underlying SCM mode with respect to nontrivial cryptanalytic attacks. 
Clearly, the number of stages can be made bigger than two by proceeding in a 
similar way. This would increase the security. 

If the underlying SCM mode is secure, then the resulting KHF and HF modes
also satisfy a stronger security property: any change of the message bits gives
rise to a random-looking change of the corresponding hash value.

As the basic construction described above requires memorizing intermediate
ciphertext(s) of the same length as the message itself, a derived construction with
lower memory requirements is to use the basic construction to define a
compression function that is applied iteratively in the usual way. Namely, let
the message be divided into blocks of a given, relatively large length l, with
the last block of variable length as required (without any padding). Let, for
simplicity, the memory size of the SCM mode utilized be equal to the hash value
length n. Then each round of the iterative construction is the basic construction
applied to a new message block, with the hash value from the previous
iteration as the initial internal state.

9 Proposal 

The underlying keystream generator to be used in the proposed constructions 
can be as simple as a self-clock-controlled nonlinear filter generator, where ir- 
regular clocking is needed to ensure that the next-state function is nonlinear. 
The nonlinear filter generator, when regularly clocked, should be designed so as 
to resist known initial state reconstruction attacks including the fast correlation 
attack, the conditional correlation attack, and the inversion attack as well as to 
achieve (with a high probability) a long period, a high linear complexity, and 
good statistical properties of the output sequence (see [8]). We propose that the
LFSR length be at least 256 and that the LFSR initial state be defined by the 
secret key at least 128 bits long. As the next-state function is not one-to-one 
due to irregular clocking, one may expect a reduced period of the keystream 
sequence, but not less than about $2^{128}$, which is long enough even for stream
cipher applications. 

The binary clock-control output is produced by an additional boolean func- 
tion with a few inputs (e.g., three) taken from the LFSR taps chosen according 
to a full positive difference set. The difference sets used for clock control and 
for the filter function should be disjoint, as in a nonlinear filter generator with 
two binary outputs (see [8]). According to the binary clock-control output, the 
LFSR is clocked once or twice per each output bit. The SCM mode is formed 
by adding the plaintext bit to the feedback bit at each time, with the plaintext 
bit repeated if the LFSR is clocked twice. 

10 Conclusions 

A general stream cipher with memory (SCM) mode, which is typically over- 
looked in the open literature, is pointed out. Its main characteristic is that each 
ciphertext symbol depends on both the current and previous plaintext symbols. 
Similarly, in decryption, each plaintext symbol depends on the current and pre- 
vious ciphertext symbols. It is shown how to convert any keystream generator 
(KG) mode into the SCM mode and their practical security is discussed. In- 
vestigating the practical security of the SCM mode of stream ciphers is a new 
interesting research area in public cryptology. Developing attacks on the SCM
mode would reveal weaknesses of the underlying KG mode, especially when used
with resynchronization. 

It is proposed how to obtain a secure self-synchronizing stream cipher from 
any secure stream cipher in the SCM mode. It is then proposed how to construct 
secure keyed hash functions, block ciphers, and hash functions from any secure 
stream cipher in the SCM mode. In all the modes, the message length can be 
made large and variable in a simple and natural way. 

The resulting designs are rather new and unusual and are based on the iter- 
ative structure of stream ciphers which is symbol rather than block based. The 
way the secret key is incorporated is new too. For example, there is no need 
for specially designed S-boxes or a special key schedule algorithm. In particular, 
the underlying stream cipher can be as simple as a single self-clock-controlled 
nonlinear filter generator, with the secret key controlling its initial state only. 

All the constructions directly extend from the binary to an arbitrary plain- 
text/ciphertext alphabet, e.g., to stream ciphers based on multiple rather than 
individual shift registers which are suitable for software realizations.

Abstract. A family of keystream generators, called the LILI keystream
generators, is proposed for use in stream cipher applications and the se- 
curity of these generators is investigated with respect to currently known 
attacks. The design is simple and scalable, based on two binary linear 
feedback shift registers combined in a simple way, using both irregular 
clocking and nonlinear functions. The design provides the basic security 
requirements such as a long period and high linear complexity, and is 
resistant to known cryptanalytic attacks. 



1 Introduction 

In this paper, a family of keystream generators based on irregularly clocked LF- 
SRs, intended for use in stream cipher applications, is proposed. We call these 
the LILI generators. The security of the LILI keystream generators is investi- 
gated with respect to currently known attacks on stream ciphers. The keystreams 
produced are shown to possess the basic security requirements for cryptographic 
sequences, such as a long period and high linear complexity. It is shown that, 
provided suitable parameters are selected, the generators are resistant to cur- 
rently known cryptanalytic attacks. Security implications of parameter selection 
are discussed. 

The LILI family of keystream generators is based on two binary linear
feedback shift registers (LFSRs). Many keystream generator designs are based 
on shift registers, both for the simplicity and speed of LFSR implementation 
in hardware and for the long period and good statistical properties LFSR se- 
quences possess. To make use of the good keystream properties while avoiding 
the inherent linear predictability of LFSR sequences, many constructions intro- 
duce nonlinearity, by applying a nonlinear function to the outputs of regularly 
clocked LFSRs or by irregular clocking of the LFSRs [13]. However, keystream 
generators using regularly clocked LFSRs are susceptible to correlation attacks, 
including fast correlation attacks, a concept first introduced in [11]. In a fast 
correlation attack, the initial states of the component shift registers are
reconstructed from a known segment of the generator output sequence, without
performing a blind search over all possible shift register initial states. As a means of
achieving immunity to these correlation attacks, keystream generators consisting
of irregularly clocked LFSRs were proposed. These keystream generators are
also susceptible to certain correlation attacks, such as the generalised correlation 
attack proposed in [6]. However, no fast correlation attacks on these generators 
have been published. 

As correlation attacks have been successful against keystream generators 
based on either a nonlinear function of regularly clocked LFSR sequences [16,14] 
or on irregular clocking of LFSRs [6,17], both approaches are combined for the 
LILI keystream generators. The use of both nonlinear functions and irregular 
clocking is not novel, having been employed in previous constructions such as 
ORYX [19] and SOBER [12]. Both ORYX and SOBER are designs for single 
generators, with fixed size LFSRs and fixed combining functions. In contrast, 
this proposal is scalable and so describes a family of keystream generators. Also, 
weaknesses in the design of ORYX resulted in the provision of a very low level 
of cryptographic security [20]. Some attacks on the SOBER proposal have also 
been identified [3]. Although the design for the LILI keystream generators de- 
scribed in this paper is conceptually simple, it produces output sequences with 
provable properties with respect to basic cryptographic security requirements 
and also provides security against currently known cryptanalytic attacks. 



2 Description of LILI Keystream Generators 

The LILI keystream generators are simple and fast keystream generators that 
use two binary LFSRs and two functions to generate a pseudorandom binary 
keystream sequence, as illustrated in Figure 1. The components of the keystream 
generator can be grouped into two subsystems based on the functions they per- 
form: clock control and data generation. The LFSR for the clock-control sub- 
system is regularly clocked. The output of this subsystem is an integer sequence 
which controls the clocking of the LFSR within the data-generation subsystem. 
If regularly clocked, the data-generation subsystem is a simple nonlinearly fil- 
tered LFSR [13] (nonlinear filter generator). Hence the LILI generator may be 
viewed as a clock-controlled nonlinear filter generator. Such a system, with the 
clock control provided by a stop-and-go generator, was examined in [4]. How- 
ever, the use of stop-and-go clocking produces repetition of the nonlinear filter 
generator output in the keystream, which may permit attacks. This system is 
an improvement on that proposal, as stop-and-go clocking is avoided. 

The clock-control subsystem of the keystream generator uses a pseudorandom
binary sequence produced by a regularly clocked LFSR, LFSRc, of length Lc
and a function, fc, operating on the contents of k stages of LFSRc to produce
a pseudorandom integer sequence, $c = \{c(t)\}_{t=1}^{\infty}$. For practical applications, it is
assumed that the feedback polynomial of LFSRc is primitive and that the initial
state of LFSRc is not the all zero state. Then LFSRc produces a maximum-length
sequence of period $P_c = 2^{L_c} - 1$. At time instant t, the contents of
a fixed set of k stages of LFSRc are input to fc and the output of fc is an
integer c(t), such that $c(t) \in \{1, 2, \ldots, 2^k\}$. The function fc is a bijective mapping
$\{0,1\}^k \to \{1, \ldots, 2^k\}$, so that the distribution of integers c(t) is close to uniform.
Thus $c = \{c(t)\}_{t=1}^{\infty}$ is a periodic integer sequence with period equal to Pc. For
example, $f_c(x_1, \ldots, x_k) = 1 + x_1 + 2x_2 + \cdots + 2^{k-1}x_k$ is appropriate.

Fig. 1. LILI keystream generator

The variable parameters in the clock-control subsystem are Lc, the feedback 
function of LFSRc, k, the positions of k stages of LFSRc used as inputs to the 
clocking function fc and fc itself. 
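The example clocking function, which maps the k input bits to one plus the integer they encode with the first input as the least significant bit, can be checked for bijectivity with a short sketch. The function name `f_c` and the choice k = 3 are illustrative assumptions.

```python
from itertools import product

def f_c(bits):
    """Map a k-bit tuple (x1, ..., xk) to 1 + x1 + 2*x2 + ... + 2**(k-1)*xk."""
    return 1 + sum(x << i for i, x in enumerate(bits))

k = 3
# Images of all 2**k input tuples, sorted; a bijection hits each of 1..2**k once.
images = sorted(f_c(b) for b in product((0, 1), repeat=k))
```

Since every integer from 1 to 2^k appears exactly once among the images, the mapping is bijective on k-bit inputs, and the all-zero input maps to 1.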

The data-generation subsystem of the keystream generator uses the integer
sequence c produced by the clock-control subsystem to control the clocking of a
binary LFSR, LFSRd, of length Ld. At time instant t, LFSRd is clocked c(t)
times. The contents of a fixed set of n stages of LFSRd are input to a Boolean
function, fd. The binary output of fd forms the keystream bit z(t). After z(t)
is produced, LFSRc is clocked and the process repeated to form the keystream
$z = \{z(t)\}_{t=1}^{\infty}$.
If LFSRd is regularly clocked, then the data-generation subsystem is simply a
nonlinear filter generator. It is assumed that the feedback polynomial of LFSRd
is primitive and that the initial state of LFSRd is not the all zero state. Then
LFSRd produces a maximum-length sequence of period $P_d = 2^{L_d} - 1$. The output
of a regularly clocked nonlinear filter generator is a periodic binary sequence,
$g = \{g(i)\}_{i=1}^{\infty}$, with period dividing Pd. The following basic result is proved in
[18].

Theorem 1. Let LFSRd have a primitive feedback polynomial and a nonzero
initial state. If fd is balanced, or if Pd is a prime and fd is not a constant
function (zero or one), then the period of g is Pd.

Now, considering the irregular clocking of LFSRd, the keystream z may be
viewed as an irregularly decimated version of the nonlinearly filtered LFSRd
sequence g, with the decimation under the control of LFSRc, so that
$z(t) = g\left(\sum_{i=1}^{t} c(i)\right)$.
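The interaction of the two subsystems can be sketched with toy parameters. The register lengths (3 and 5), the feedback taps, the filter function, and the stage positions below are assumptions picked only to keep the example small; they are not parameters proposed in this paper.

```python
LC, LD = 3, 5             # toy register lengths (assumption; far too small for real use)

def clock(state, taps, length):
    """Clock a Fibonacci LFSR once; the new bit is the XOR of the tapped stages."""
    fb = 0
    for t in taps:
        fb ^= (state >> t) & 1
    return ((state << 1) | fb) & ((1 << length) - 1)

def f_c(state):
    """Clock-control function on k = 2 stages: 1 + x1 + 2*x2, a value in 1..4."""
    return 1 + (state & 1) + 2 * ((state >> 1) & 1)

def f_d(state):
    """Toy balanced nonlinear filter on n = 3 stages (hypothetical choice)."""
    return (state ^ ((state >> 1) & (state >> 3))) & 1

def lili(state_c, state_d, nbits):
    """Return (keystream z, clocking sequence c) for nonzero initial states."""
    z, cs = [], []
    for _ in range(nbits):
        c = f_c(state_c)
        for _ in range(c):                        # LFSR_d is clocked c(t) times
            state_d = clock(state_d, (4, 2), LD)
        z.append(f_d(state_d))
        cs.append(c)
        state_c = clock(state_c, (2, 1), LC)      # LFSR_c is regularly clocked
    return z, cs
```

Running the data register with regular clocking and indexing its filter output by the running sum of the clocking sequence reproduces the same keystream, which is the decimation view of the generator.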

The variable parameters in the data-generation subsystem are Ld, the feed- 
back function of LFSRd, n, the positions of n stages of LFSRd used as inputs 
to the filter function fd and fd itself. The function fd should be balanced, highly 
nonlinear and offer some order of correlation immunity relative to the positions
of n stages used as inputs to fd (see [9]). The nonlinearity of a Boolean function
is defined to be the minimum Hamming distance between the function and any 
affine function of the same inputs. The correlation-immunity order of a Boolean 
function is defined to be the maximum nonnegative integer m such that the 
output is statistically independent of any subset of m inputs, provided that the 
inputs are uniformly distributed and statistically independent. 
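For small n, both quantities defined above can be computed by brute force. The sketch below uses the Walsh transform, relying on two standard facts rather than on anything stated in this paper: the nonlinearity equals 2^(n-1) minus half the largest absolute Walsh value, and a function is correlation immune of order m exactly when its Walsh values vanish at all points of Hamming weight 1 to m. The function names and the example filter are illustrative assumptions.

```python
from itertools import product

def walsh(f, n):
    """Walsh spectrum: W(a) = sum over x of (-1)^(f(x) XOR a.x)."""
    pts = list(product((0, 1), repeat=n))
    spec = {}
    for a in pts:
        total = 0
        for x in pts:
            dot = sum(ai & xi for ai, xi in zip(a, x)) & 1
            total += (-1) ** (f(x) ^ dot)
        spec[a] = total
    return spec

def nonlinearity(f, n):
    """Minimum Hamming distance from f to any affine function of n inputs."""
    return 2 ** (n - 1) - max(abs(w) for w in walsh(f, n).values()) // 2

def correlation_immunity(f, n):
    """Largest m such that W(a) = 0 for every a with 1 <= weight(a) <= m."""
    spec = walsh(f, n)
    m = 0
    for order in range(1, n + 1):
        if any(w != 0 for a, w in spec.items() if sum(a) == order):
            break
        m = order
    return m

def example_filter(x):
    """Toy 3-input function x1 XOR (x2 AND x3)."""
    return x[0] ^ (x[1] & x[2])
```

For the toy function the Walsh value at the all-zero point is 0, so it is balanced; its nonlinearity is 2 and its correlation-immunity order is 0, which illustrates the trade-off between the two criteria for so few inputs.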

3 Keystream Properties 

Several properties of pseudorandom binary sequences are considered basic se- 
curity requirements: a sequence that does not possess these properties is gener- 
ally considered unsuitable for cryptographic applications. Basic requirements for 
pseudorandom binary sequences are a long period, high linear complexity and 
good statistics regarding the distribution of zeroes and ones in the output. High 
linear complexity avoids an attack using the Berlekamp-Massey [10] algorithm, 
which requires a length of keystream only twice the linear complexity of the 
sequence to produce the entire keystream. A bias in the distribution of zeroes 
and ones in the keystream can be used to reduce the unpredictability of the 
keystream sequence. These basic requirements are addressed with respect to the
LILI family of keystream generators in the remainder of this section.
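To make the role of linear complexity concrete, here is a minimal sketch of the Berlekamp-Massey algorithm over GF(2). It is a textbook formulation, not code from this paper, and the name `linear_complexity` is an assumption.

```python
def linear_complexity(s):
    """Berlekamp-Massey over GF(2): length of the shortest LFSR generating bits s."""
    n = len(s)
    c = [0] * (n + 1)      # current connection polynomial
    b = [0] * (n + 1)      # connection polynomial before the last length change
    c[0] = b[0] = 1
    L, m = 0, -1
    for i in range(n):
        # Discrepancy between s[i] and the prediction of the current LFSR.
        d = s[i]
        for j in range(1, L + 1):
            d ^= c[j] & s[i - j]
        if d:
            t = c[:]
            shift = i - m
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]
            if 2 * L <= i:
                L, m, b = i + 1 - L, i, t
    return L
```

For a maximum-length sequence from a 3-stage LFSR the function returns 3 once six bits (twice the linear complexity) are supplied, in line with the keystream requirement stated above.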

3.1 Period 

The maximum value for the period of z and the conditions under which this value 
is obtained are given in the following theorem. The result is easily obtained from 
Theorem 1 and the application of a result regarding the period of irregularly 
decimated sequences from [2]. 

Theorem 2. Let both LFSRc and LFSRd have primitive feedback polynomials
and nonzero initial states. If $2^{L_d} - 1$ is a prime and fd is not a constant function,
or if fd is balanced and $2^{L_c - 1}(2^k + 1) - 1$ is relatively prime to $2^{L_d} - 1$ (provided
that $f_c(0, \ldots, 0) = 1$), then the period of the output sequence z is given by the
product $P_z = (2^{L_c} - 1)(2^{L_d} - 1)$.

Note that this period implies that each distinct initial state results in the 
production of a distinct keystream, avoiding the reduction in keyspace which 
commonly occurs in keystream generators using irregular clocking, where several 
initial states produce the same keystream [17,12]. 

3.2 Linear Complexity 

For the proposed keystream generator, the output of a nonlinear filter generator
with period $P_d = 2^{L_d} - 1$ or a divisor of Pd is nonuniformly decimated by means
of a sequence with period $P_c = 2^{L_c} - 1$. In [5], the following upper bound on the
linear complexity of irregularly decimated maximum-length sequences is given.

Table 1. Period and linear complexity of binary sequences produced by LILI keystream
generators

When a maximum-length sequence of period Pd is nonuniformly decimated by
means of a decimating sequence of period Pc, if the sum modulo Pd of Pc
successive values of the decimating sequence equals S, then the decimated sequence
has a maximum linear complexity of $L_d \cdot P_c$ only if the multiplicative order of
2 modulo $P_d / \gcd(P_d, S)$ is equal to Ld. Note that this condition is satisfied if
$\gcd(P_d, S) = 1$. In [5] it is also shown that if the decimating sequence is randomly
chosen, then the probability that maximum linear complexity is obtained
can be made arbitrarily close to one for appropriately chosen Ld and Pc.

For a nonuniformly decimated nonlinearly filtered LFSR sequence, the 
maximal attainable linear complexity is $L_g \cdot P_c$, where $L_g$ is the linear complexity of 
the (regularly clocked) nonlinearly filtered LFSRd sequence. It is known (e.g., 
see [13]) that $L_g$ depends on the filter function and on the positions of stages 
used for its inputs and that $L_g$ is very likely to be lower bounded by $\binom{L_d}{r}$, where 
r is the nonlinear algebraic order of the filter function. Accordingly, our 
conjecture is that the linear complexity of a nonuniformly decimated nonlinearly 
filtered LFSRd sequence is very likely to be lower-bounded by $\binom{L_d}{r} \cdot P_c$. As a 
consequence, it is also lower-bounded by $L_d \cdot P_c$. 

To investigate this conjecture, computer simulations were performed for key- 
stream generators as described in Section 2, for various small shift register 
lengths. In each case, a balanced nonlinear 3-input Boolean function, 
with r = 2, was used as the filter function, and the stages of LFSRd 
used for inputs to the filter function were selected to form a full positive differ- 
ence set. That is, the distances between any two stages are distinct. For each 
keystream generator, a keystream sequence of length greater than the maximum 
period of the keystream was produced and the period, Pz, and linear complexity, 
Lz, of the sequence were determined. These values are recorded in Table 1, and 
support both the theorem regarding the period and the conjecture regarding the 
linear complexity. 



3.3 Statistical Properties of Output Sequence 

Under regular clocking, one period of the sequence d produced by LFSRd 
contains $2^{L_d-1} - 1$ zeroes and $2^{L_d-1}$ ones. For a balanced filter 
function such that $f_d(0, \ldots, 0) = 0$, a segment of length $2^{L_d} - 1$ of the regularly 
clocked nonlinear filter generator output sequence g has the same distribution 
of zeroes and ones as d. When the clocking of LFSRd is under the control of 
LFSRc and when the period of z is $(2^{L_c} - 1)(2^{L_d} - 1)$, then each pair of LFSRc 
and LFSRd states occurs exactly once in a period of z. Therefore one period of 
z contains $(2^{L_c} - 1)(2^{L_d-1} - 1)$ zeroes and $(2^{L_c} - 1) 2^{L_d-1}$ ones, thus maintaining 
the same proportion of zeroes and ones as in d. 
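
The period and bit-count claims above can be checked on a toy instance. The following is a minimal sketch, not the authors' simulation: the register lengths (3 and 5), the primitive trinomials, the tap positions (0, 1, 3), the filter x1 XOR (x2 AND x3), the choice k = 1, and all function names are assumptions made for illustration only.

```python
# Toy LILI-style generator (illustrative parameters, not taken from the paper).
C_TAPS = (0, 1)   # LFSRc recurrence s[t+3] = s[t+1] ^ s[t], i.e. x^3 + x + 1
D_TAPS = (0, 2)   # LFSRd recurrence s[t+5] = s[t+2] ^ s[t], i.e. x^5 + x^2 + 1
LC, LD = 3, 5


def lfsr_step(state, taps):
    """Shift a Fibonacci LFSR once; the new bit is the XOR of the tapped stages."""
    new_bit = 0
    for i in taps:
        new_bit ^= state[i]
    return state[1:] + [new_bit]


def f_d(x1, x2, x3):
    """Balanced 3-input filter of algebraic order 2 with f_d(0, 0, 0) = 0."""
    return x1 ^ (x2 & x3)


def keystream(n, c_state, d_state):
    """Produce n bits; LFSRd is clocked c(t) = 1 + a(t) times per bit (k = 1)."""
    z = []
    for _ in range(n):
        c = 1 + c_state[0]
        c_state = lfsr_step(c_state, C_TAPS)
        for _ in range(c):
            d_state = lfsr_step(d_state, D_TAPS)
        z.append(f_d(d_state[0], d_state[1], d_state[3]))
    return z


def least_period(seq, max_period):
    """Smallest p <= max_period with seq[i] == seq[i + p] throughout seq."""
    for p in range(1, max_period + 1):
        if all(seq[i] == seq[i + p] for i in range(len(seq) - p)):
            return p
    return None


P_Z = (2**LC - 1) * (2**LD - 1)            # claimed period: 7 * 31
z = keystream(2 * P_Z, [1, 0, 0], [1, 0, 0, 0, 0])
period = least_period(z, P_Z)
ones = sum(z[:P_Z])
zeroes = P_Z - ones
print(period, zeroes, ones)
```

For this toy instance the sketch should report a period of 217, with 105 zeroes and 112 ones per period, matching $(2^{L_c} - 1)(2^{L_d-1} - 1)$ and $(2^{L_c} - 1) 2^{L_d-1}$ for $L_c = 3$, $L_d = 5$.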

At a more detailed level, the choice of filter function has an effect on the 
keystream statistics. For a regularly clocked nonlinear filter generator the output 
sequence may not possess good statistics as the inputs to the filter function are 
correlated rather than independent. To guarantee good statistical properties, the 
nonlinear filter function can be chosen to be linear in either the first or the last 
variable [9]. 



3.4 Throughput Rate 



In producing the keystream, LFSRd is clocked c(t) times before z(t) is produced. 
Thus LFSRd is clocked at least once and at most $2^k$ times before each keystream 
bit is produced, with the distribution of values of c(t) almost uniform. Over 
one period of c, LFSRd is clocked $\sum c(t) = 2^{L_c-1}(2^k + 1) - 1$ times so, on 
average, LFSRd is clocked $\frac{2^{L_c-1}(2^k + 1) - 1}{2^{L_c} - 1}$ times per keystream symbol produced. 
For large $L_c$, this is approximately $\frac{2^k + 1}{2}$. Thus, for large $L_c$, the throughput 
rate is approximately $\frac{2}{2^k + 1}$ of the rate at which LFSRd is clocked, provided an 
appropriate buffer is used. If not, then one must allow $2^k$ clocks of LFSRd for 
each keystream bit. However, the use of a buffer is very sensitive in high-speed 
applications. 
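
As a worked illustration of the averages above, the following sketch evaluates the clock count per keystream bit exactly. The function names are hypothetical; the formula is the one stated in the text.

```python
from fractions import Fraction


def average_clocks(Lc, k):
    """Average number of LFSRd clocks per keystream bit over one period of c."""
    total_clocks = 2**(Lc - 1) * (2**k + 1) - 1   # sum of c(t) over one period
    return Fraction(total_clocks, 2**Lc - 1)       # divided by the period of c


def throughput(Lc, k):
    """Keystream bits produced per LFSRd clock, assuming a suitable buffer."""
    return 1 / average_clocks(Lc, k)


print(float(average_clocks(39, 2)), float(throughput(39, 2)))
```

For $L_c = 39$ and k = 2 the average is very close to 5/2, so the buffered throughput is about 2/5 of the LFSRd clock rate.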

Alternatively, to achieve the maximum throughput rate of 1, instead of 
irregularly clocking the shift register a given number of steps, multiple copies of 
the feedback function can be maintained, one for each possible value of c{t). The 
irregular clocking can then be performed in one step only (both in hardware and 
software). Thus there is a tradeoff between hardware space and timing regularity. 
Note that the use of either a buffer or parallel-feedback method would provide 
resistance against timing attacks. 



4 Possible Attacks 

A number of attacks should be considered with respect to the LILI family of 
keystream generators. These are known-plaintext attacks conducted under the 
assumption that the cryptanalyst knows the complete structure of the generator, 
and the secret key is only the initial states of the component shift registers. For 
all attacks, the given keystream is viewed as an irregularly decimated version of 
a nonlinearly filtered LFSRd sequence, with the decimation under the control 
of LFSRc. For keystream generators based on more than one LFSR where the 
key consists of the initial states of the LFSRs, such as the LILI generators, 
divide-and-conquer attacks on individual LFSRs should be considered. We deal 
firstly with divide-and-conquer attacks that target LFSRd, and then with those 
attacks that target LFSRc. 

4.1 Attacks on Irregularly Clocked LFSRd 

Suppose a keystream segment of length N is known, say $\{z(t)\}_{t=1}^{N}$. This is a 
decimated version of a segment of length M of the underlying regularly clocked 
nonlinearly filtered LFSRd sequence, $g = \{g(i)\}_{i=1}^{M}$, where M > N. The 
objective of correlation attacks targeting LFSRd is to recover the initial state of 
LFSRd by identifying the segment $\{g(i)\}_{i=1}^{M}$ that $\{z(t)\}_{t=1}^{N}$ was obtained from 
through decimation, using the correlation between the regularly clocked sequence 
and the keystream, without knowing the decimating sequence. 

For clock-controlled shift registers with constrained clocking, correlation at- 
tacks based on a constrained Levenshtein distance and on a probabilistic mea- 
sure of correlation are proposed in [6] and [7], respectively, and further analysed 
in [8]. These attacks could be adapted to be used as the first stage of a divide- 
and-conquer attack on the LILI keystream generators. 

For a candidate initial state of LFSRd, say $\{d(i)\}_{i=1}^{L_d}$, use the known LFSRd 
feedback function to generate a segment of the LFSRd sequence, $\{d(i)\}_{i=1}^{M}$, 
for some $M > L_d$. Then use the known filter function $f_d$ to generate a 
segment of length M of the output of the nonlinear filter generator when regularly 
clocked, $\{g(i)\}_{i=1}^{M}$. A measure of correlation between $\{g(i)\}_{i=1}^{M}$ and $\{z(t)\}_{t=1}^{N}$ 
is calculated (either the Constrained Levenshtein Distance (CLD) [6], or the 
Probabilistic Constrained Edit Distance (PCED) [7]) and the process repeated 
for all LFSRd initial states. 
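
To make the candidate test concrete, here is a minimal sketch of a deletion-only embedding check: it asks whether the keystream could have been obtained from a candidate regularly clocked sequence by advancing between 1 and $2^k$ positions per output bit. This is a simplification for illustration and is not the CLD or PCED computation of [6] or [7]; the function name and the 1-based position convention are assumptions.

```python
def can_embed(z, g, max_step):
    """Return True if z can be read off g by advancing 1..max_step positions per bit.

    Positions in g are 1-based; position 0 means "before the first bit of g".
    """
    positions = {0}
    for bit in z:
        reachable = set()
        for p in positions:
            for step in range(1, max_step + 1):
                q = p + step
                if q <= len(g) and g[q - 1] == bit:
                    reachable.add(q)
        if not reachable:
            return False          # this candidate g is inconsistent with z
        positions = reachable
    return True


g_candidate = [1, 0, 1, 1, 0, 1]
print(can_embed([0, 1, 1], g_candidate, 2))
print(can_embed([0, 0, 0], g_candidate, 2))
```

A candidate LFSRd initial state would be kept only if its sequence passes such a test; the set of surviving positions grows with max_step, which is one way to see why larger k makes the attack harder.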

In either case, the attack is considered successful if only a few initial states 
are identified. As the correlation attack based on the PCED takes into account 
the probability distribution of the decimating sequence, it is statistically optimal 
and may be successful in cases where the embedding attack based on the CLD 
is not, such as for larger values of k. The value of M is a function of N and k. If 
$M = 2^k \times N$, then the probability of not identifying the correct LFSRd initial 
state is zero. 

The second stage of a divide-and-conquer attack on the generator is the 
recovery of the initial state of the second shift register. This can be performed 
as in [17]. From the calculation of the edit distance (either CLD or PCED) 
between $\{g(i)\}_{i=1}^{M}$ and $\{z(t)\}_{t=1}^{N}$, form the edit distance matrix, and use this to 
find possible edit sequences. From each possible edit sequence, form a candidate 
integer sequence $\{c(t)\}_{t=1}^{N}$. From this, the underlying binary sequence $\{a(t)\}_{t=1}^{N}$ 
and hence the candidate initial state of LFSRc can be recovered. To determine 
whether the correct initial states of both LFSRs have been recovered, use both 
candidate initial states to generate a candidate keystream and compare it with 
the known keystream segment. 

To conduct either of these correlation attacks requires exhaustive search 
of LFSRd initial states. For each LFSRd initial state, the attacks require 
calculation of either the CLD or the PCED, with computational complexity 
$O(N(M - N))$. Finally, further computational complexity is added in finding 
the corresponding LFSRc initial state. For either correlation attack, the 
minimum length of keystream required for a successful attack on LFSRd is linear 
in $L_d$, but exponential or even superexponential in $2^k$ (see [8]). For k = 1, the 
required keystream length [22] is reasonably small, but a small increase in k will 
render this length prohibitively large. 

4.2 Attacks Targeting LFSRc 

A possible approach to attacking the proposed generator is by targeting the 
clock-control sequence produced by LFSRc. Guess an initial state of LFSRc, 
say $\{a(t)\}_{t=1}^{L_c}$. Use the known LFSRc feedback function and the function $f_c$ to 
generate the decimating sequence $\{c(t)\}_{t=1}^{N}$ for some $N > L_c$. Then position 
the known keystream bits $\{z(t)\}_{t=1}^{N}$ in the corresponding positions of $\{g(i)\}_{i=1}^{M}$, 
the nonlinear filter generator output when regularly clocked. At this point we 
have some (not all consecutive) terms in the nonlinear filter generator output 
sequence and are trying to reconstruct a candidate initial state for LFSRd. The 
attack could then proceed in several ways. 
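
The positioning step can be sketched as follows, assuming the convention that z(t) equals g(i) with i the running sum of c(1), ..., c(t). The function name and the dictionary representation are hypothetical choices for illustration.

```python
from itertools import accumulate


def place_keystream(z, c):
    """Map each known keystream bit to its index in the regularly clocked sequence g.

    Assumes z(t) = g(c(1) + ... + c(t)) with 1-based indices; indices of g that
    were skipped by the clocking do not appear in the result.
    """
    return dict(zip(accumulate(c), z))


# Guessed clock sequence 1, 2, 1, 2 places four keystream bits at g(1), g(3), g(4), g(6).
print(place_keystream([1, 0, 1, 1], [1, 2, 1, 2]))
```

The gaps in the resulting index set (here g(2) and g(5)) are what make the follow-up attacks on LFSRd harder than in the regularly clocked case.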

Consistency Attack. One method is to use the known filter function fd to 
write equations relating terms in the underlying LFSRd sequence to terms in 
$\{g(i)\}_{i=1}^{M}$. Reject the guessed initial state when the equations are 
inconsistent. This is a generalisation of the linear consistency test [21]. The feasibility 
of such an approach depends on the number, n, of inputs to fd, on the tap posi- 
tions producing these inputs and on some properties of fd such as its nonlinearity 
and order of correlation immunity. 

Attacks on Regularly Clocked LFSRd. An alternative approach would be 
to use a correlation attack on the nonlinear filter generator [14] to recover a linear 
transform of the LFSRd sequence, and then recover the LFSRd initial state. 
However, this is complicated by not having consecutive terms in the regularly 
clocked nonlinear filter generator sequence. The feasibility of such an attack 
primarily depends on the use of a feedback polynomial of LFSRd that is of low 
weight or has low-weight polynomial multiples and on the nonlinearity of $f_d$. 

An alternative correlation attack on a (regularly clocked) nonlinear filter 
generator which could be applied at this point is the conditional correlation 
attack [1], with a difference that the known output bits are not consecutive. The 
feasibility of such an attack depends on n and on the tap positions. The use of a 
full positive difference set for the tap positions, as suggested in [9], and of filter 
functions with correlation-immunity order greater than zero would render this 
attack infeasible. 

Finally, the inversion attack [9] can be adapted to deal with the case of non- 
consecutive output bits, but the associated branching process is then 
supercritical, because more than one bit has to be guessed at a time. As a consequence, 
the computational complexity may be prohibitively high even if the tap positions 
are not spread across the LFSRd length. 

Applying any of these approaches requires exhaustive search over the LFSRc 
initial state space and additional computation for each candidate LFSRc state. 
However, as only some (not all consecutive) terms in the nonlinear filter genera- 
tor output sequence are available, the required additional computation appears 
to be prohibitive, especially for highly nonlinear filter functions with a large 
number of inputs and sufficiently high correlation-immunity order, for the tap 
positions chosen according to a full positive difference set and for the feedback 
polynomial of LFSRd not having low-weight polynomial multiples of relatively 
small degrees. 

5 Choice of Parameters 

As an initial security consideration, we should choose the sizes of the shift reg- 
isters so that exhaustive search of the initial states is prohibitive; at present we 
recommend that Lc + Ld > 100. For a keysize in line with the AES specifications 
for block ciphers, use Lc + Ld = 128. To prevent divide-and-conquer attacks, 
neither Ld nor Lc should be small. To ensure a large period and good statisti- 
cal properties, the feedback polynomials of both LFSRc and LFSRd should be 
primitive. In addition, as noted in Section 3, for generator parameters satisfying 
the conditions of Theorem 2, the period of the output sequence z attains a 
maximum value of $(2^{L_c} - 1)(2^{L_d} - 1)$, implying that every initial state generates 
a distinct keystream. Furthermore, the selection of parameters should reduce the 
possibility of the attacks discussed in Section 4. We address each subsystem of 
the keystream generators in turn. 

5.1 Clock Control 

The number, k, of taps from LFSRc used to form the clocking sequence c affects 
the period of the output sequence and the resistance against the correlation 
attacks on irregularly clocked LFSRd, described in Section 4.1, and is the sole 
factor determining the output rate of the generator. To this end, we recommend 
k > 1 (e.g., k = 2 or k = 3). The choice of the tap positions does not seem 
to be important with respect to known attacks, but to be on the safe side, we 
recommend the use of full positive difference sets. 

Also, if the conditions of Theorem 2 are not satisfied, then the period of z is 
upper-bounded by the product of the period of c and any factors of the period 
of the nonlinear filter generator output (if regularly clocked) which are relatively 
prime to $2^{L_c-1}(2^k + 1) - 1$. Thus, for any chosen value of k, 
$\gcd(2^{L_c-1}(2^k + 1) - 1, 2^{L_d} - 1)$ should be calculated, and the keystream period is maximised when 
this is one. 
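
The gcd check recommended here is a one-line computation. The sketch below uses a hypothetical function name; the second call uses toy parameters chosen only to show a case where the gcd is not one.

```python
from math import gcd


def clocking_gcd(Lc, Ld, k):
    """gcd of the total LFSRd clocks per period of c and the LFSRd period."""
    clocks_per_period = 2**(Lc - 1) * (2**k + 1) - 1
    return gcd(clocks_per_period, 2**Ld - 1)


print(clocking_gcd(39, 89, 2))   # parameters of the example in Section 6
print(clocking_gcd(3, 10, 1))    # toy case: 11 divides 2^10 - 1 = 1023
```

When the result is greater than one, as in the toy case, the period of z can fall short of the product of the two LFSR periods.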

5.2 Data Generation 

Firstly, the feedback polynomial of LFSRd should not have low-weight 
polynomial multiples of relatively small degrees, in order to avoid the vulnerability to 
fast correlation attacks on LFSRd when regularly clocked. 

Secondly, the number, n, and positions of taps for the filter function, fd, 
should be chosen so as to ensure the resistance to attacks discussed in Section 
4.2. For example, we recommend that n > 10 and that the tap positions form a 
full positive difference set if possible. 

Thirdly, the filter function, fd, should be balanced in order to achieve good 
statistical properties and a large period (Theorem 1). 

Fourthly, fd should be chosen so as to reduce the possibility of attacks 
discussed in Section 4.2 (especially if k = 1). To this end, fd should have high 
correlation-immunity order and high nonlinearity. The proportion of balanced 
Boolean functions which offer any nonzero order of correlation immunity is small, 
making it unlikely that a randomly generated function will meet these criteria. 
Instead, a filter function should be constructed to obtain the required properties. 
Since there are tradeoffs between nonlinearity, correlation-immunity order and 
algebraic order, we seek functions that optimise these bounds. 

In [15], it was proven that balanced Boolean functions exist with 10 inputs, 
correlation-immunity order 3, algebraic order 6 and nonlinearity 480. In the 
same paper a function with CI(1), algebraic order 8 and nonlinearity 484 was 
constructed. Both of these Boolean functions maximise the Siegenthaler tradeoff 
and they have the highest possible nonlinearity for their given order of correlation 
immunity, so either would be a good choice for the output function fd. For our 
example, we choose a CI(3) function as we believe that gives a greater resistance 
to conditional correlation attacks. 

6 Example 

For a 128-bit key, we select the lengths of LFSRc and LFSRd to be 39 and 
89, respectively. The feedback polynomials of both LFSRc and LFSRd are the 
primitive polynomials of degrees 39 and 89, respectively, listed in the Appendix. 

For the clock-control subsystem, the length of LFSRc is $L_c = 39$, from which 
k = 2 bits are selected to determine the number of data clocks by the natural 
mapping: $f_c(x_1, x_2) = 1 + x_1 + 2x_2$. 

For the data-generation subsystem, we let n = 10. Now, we have Ld > 80 
and this permits the positions of inputs to fd to form a full positive difference 
set, shown in the Appendix. Also, we select fd from [15] to be a balanced, CI(3) 
function of 10 inputs, with nonlinearity 480 and algebraic order r = 6 (see the 
Appendix for the truth table). 



6.1 Properties 

As the feedback polynomial of LFSRd is primitive, fd is balanced and in addition 
$2^{89} - 1$ is a Mersenne prime, the conditions of Theorem 2 are satisfied. Thus the 
period of the keystream is $P_z = (2^{39} - 1)(2^{89} - 1)$. According to Section 3.2, 
the linear complexity of the keystream sequence is conjectured to be at least 
$\binom{L_d}{r} \cdot P_c = \binom{89}{6} \cdot (2^{39} - 1) \approx 2^{68}$. With regard to the security offered by this 
value, we note that this means that about $2^{68}$ known plaintext bits must be 
intercepted in order to perform the Berlekamp-Massey [10] attack. As the key 
will be changed well before even a fraction of this amount of data is generated, 
LILI is considered to be secure from such an attack. 
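
The size of the conjectured bound can be checked numerically. The sketch below evaluates $\binom{L_d}{r} \cdot P_c$ for $L_d = 89$, r = 6 and $L_c = 39$; the function name is hypothetical.

```python
from math import comb, log2


def conjectured_lc_log2(Ld, r, Lc):
    """log2 of the conjectured lower bound C(Ld, r) * (2^Lc - 1) on linear complexity."""
    return log2(comb(Ld, r) * (2**Lc - 1))


print(comb(89, 6), conjectured_lc_log2(89, 6, 39))
```

The binomial coefficient is slightly above $2^{29}$, so the product with $2^{39} - 1$ lands just above $2^{68}$.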

6.2 Possible Attacks 

Both the period and the conjectured linear complexity of the keystream are too 
large to be used in cryptanalytic attacks. 

The choice of parameters for the data-generation subsystem, in particular the 
Boolean function fd, makes attacks targeting LFSRc, outlined in Section 4.2, 
infeasible. In [14], fast correlation attacks on regularly clocked nonlinear filter 
generators with low-weight feedback polynomials and a known keystream seg- 
ment of 20,000 bits were not successful when the probability of noise, p, exceeded 
0.45. The computational complexity of these attacks is proportional to the length 
of keystream used and the average number of parity checks used per keystream 
bit. For the assumed function fd, the probability of noise is given as p = 0.46875, 
so that the amount of keystream required would be much greater than 20,000 
bits. This is likely to make the complexity of an attack on a regularly clocked 
nonlinear filter generator prohibitive, even if enough low-weight polynomial mul- 
tiples of the LFSRd feedback polynomial, used to form parity checks, could be 
obtained. Given that the keystream segment is from a clock-controlled nonlinear 
filter generator and that the LFSRd feedback polynomial does not have low- 
weight polynomial multiples of relatively small degrees, such an attack appears 
infeasible. 

The length of LFSRd makes attacks targeting LFSRd, outlined in 
Section 4.1, infeasible as these attacks require exhaustive search of the initial states 
of LFSRd, performing some calculation of the correlation for each state. The 
complexity of such attacks is $O((2^{89} - 1)(3N^2))$, where the required length of 
the known keystream, N, is very likely to be very large even for k = 2. In 
[17], successful probabilistic correlation attacks were performed on the shrinking 
generator for given keystream lengths of twenty times the length of the 
underlying LFSR. The deletion rate for this example is similar, so an estimate of the 
complexity of these attacks is $O(2^{112})$. 
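
The arithmetic behind this estimate can be reproduced as a rough sketch, under these assumptions: the per-state cost is the edit-distance cost N(M - N) from Section 4.1 with $M = 2^k N$, and N is taken as twenty times $L_d$ by analogy with [17]. The function name and the default factor are illustrative only.

```python
from math import log2


def lfsrd_attack_log2_cost(Ld, k, keystream_factor=20):
    """log2 of (number of LFSRd states) * (edit-distance cost per state)."""
    N = keystream_factor * Ld        # assumed known keystream length
    M = 2**k * N                     # length of the regularly clocked segment
    per_state = N * (M - N)          # equals 3 * N^2 when k = 2
    return log2(2**Ld - 1) + log2(per_state)


print(lfsrd_attack_log2_cost(89, 2))
```

With $L_d = 89$ and k = 2 this gives N = 1780 and a per-state cost of about $2^{23}$, so the total is a little over $2^{112}$ operations.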

7 Conclusion 

In this paper, a family of keystream generators, intended for use in stream cipher 
applications, is proposed. The design is both simple and scalable: the generators 
are based on two binary LFSRs and use two combining functions. The security 
of these keystream generators is investigated. For appropriately chosen compo- 
nents, the generators are shown to provide the basic security requirements for 
cryptographic sequences, such as a long period and high linear complexity. Also, 
they are immune to current known-plaintext attacks, conducted under the as- 
sumption that the cryptanalyst knows the entire structure of the generator and 
the secret key is only the initial states of the two LFSRs. 

To select an instance from the proposed family, it is necessary to select 
appropriate values for Lc, Ld, k and n, to have primitive feedback polynomials of 
both LFSRs and a highly nonlinear balanced Boolean function of an appropriate 
correlation-immunity order for the filter function. The selection of components 
can maximise the period and minimise the chances of a successful cryptanalytic 
attack. The use of both nonlinear combining functions and irregular clocking in 
LFSR-based stream ciphers is not a novel proposal, and has been employed in 
previous constructions. However, in this proposal the two approaches are com- 
bined in a manner that produces output sequences with provable properties with 
respect to basic cryptographic security requirements and also provides security 
against currently known cryptanalytic attacks. 



Appendix 



Full Details of Example LILI with 128 Bit Key 

The LFSRs have these feedback polynomials: 

LFSRc: $x^{39} + x^{35} + x^{33} + x^{31} + x^{17} + x^{15} + x^{14} + x^{2} + 1$ 

LFSRd: $x^{89} + x^{83} + x^{80} + x^{55} + x^{53} + x^{42} + x^{39} + x + 1$. 



The two inputs $x_1, x_2$ to $f_c$ are taken from LFSRc positions 12 and 20, where 
the range is [0, 38]. 

The 10 inputs to fd are taken from LFSRd positions according to this full 
positive difference set: (0, 1, 3, 7, 12, 20, 30, 44, 65, 80). 
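
The defining property of a full positive difference set, namely that all pairwise differences are distinct, can be checked directly for the tap positions listed above. The function name in this sketch is hypothetical.

```python
from itertools import combinations


def is_full_positive_difference_set(positions):
    """True if all pairwise positive differences between positions are distinct."""
    diffs = [b - a for a, b in combinations(sorted(positions), 2)]
    return len(diffs) == len(set(diffs))


TAPS = (0, 1, 3, 7, 12, 20, 30, 44, 65, 80)
print(is_full_positive_difference_set(TAPS))
```

The ten taps give 45 pairwise differences, all distinct, so the listed positions do satisfy the property.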

The truth table of the output function $f_d$: 



0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 

0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 

0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 

1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 

0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 

0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 

0,1,0,1,1,0,1,0,1,0,1,0,0,1,0,1,1,0,1,0,0,1,0,1,0,1,0,1,1,0,1,0, 

1,0,1,0,0,1,0,1,0,1,0,1,1,0,1,0,0,1,0,1,1,0,1,0,1,0,1,0,0,1,0,1, 

0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 

0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 

0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 

0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 

0 , 0 , 0 , 0 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 1 , 1 , 1 , 1 , 

1 , 1 , 1 , 1 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 0 , 0 , 0 , 0 , 

0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 

1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 

0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 

1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 

0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 
1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 , 
0 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 

1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 
0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 

1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 
0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 

1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 
0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 

1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 
0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 

1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 0 , 1 , 1 , 0 , 
0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 

1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 , 1 , 0 , 0 , 1 , 0 , 1 , 1 , 0 . 

This Boolean Function has 10 inputs and these properties: balanced, CI(3), 
algebraic order 6, nonlinearity 480, no linear structures. 

Abstract. It has recently been shown that when $m > \frac{1}{2}n - 1$, the 
nonlinearity $N_f$ of an mth-order correlation immune function f with 
n variables satisfies the condition of $N_f \le 2^{n-1} - 2^{m}$, and that when 
$m > \frac{1}{2}n - 2$ and f is a balanced function, the nonlinearity satisfies 
$N_f \le 2^{n-1} - 2^{m+1}$. In this work we prove that the general inequality, 
namely $N_f \le 2^{n-1} - 2^{m}$, can be improved to $N_f \le 2^{n-1} - 2^{m+1}$ for 
$m > 0.6n - 0.4$, regardless of the balance of the function. We also show 
that correlation immune functions achieving the maximum nonlinearity 
for these functions have close relationships with plateaued functions. The 
latter have a number of cryptographically desirable properties. 

Key words: Correlation Immune Functions, Nonlinearity, Resilient 
Functions, Plateaued Functions, Stream Ciphers 



1 Introduction 

Correlation immunity has long been recognized as one of the critical indicators 
of nonlinear combining functions of shift registers in stream generators (see [12]). 
A high correlation immunity is generally a very desirable property, in view of 
various successful correlation attacks against a number of stream ciphers (see 
for instance [6]). 

Another class of cryptanalytic attacks against stream ciphers, called best 
approximation attacks, were advocated in [4]. Success of these attacks in breaking 
a stream cipher is made possible by exploiting the low nonlinearity of functions 
employed by the cipher, and it highlights the significance of nonlinearity in the 
analysis and design of encryption algorithms. 

Recently Sarkar and Maitra [10] have proved that when $m > \frac{1}{2}n - 1$, the 
nonlinearity $N_f$ of an mth-order correlation immune function f with n variables 
satisfies the condition of $N_f \le 2^{n-1} - 2^{m}$. In addition they have shown that if 
f is balanced and $m > \frac{1}{2}n - 2$, then the condition becomes $N_f \le 2^{n-1} - 2^{m+1}$. 
(See also Section 8 for independent efforts by researchers other than Sarkar and 
Maitra.) 
In this work we focus our attention on the case of $m > 0.6n - 0.4$. We show 
that for such m and n, the nonlinearity of an mth-order correlation immune 
function f with n variables must satisfy the condition of $N_f \le 2^{n-1} - 2^{m+1}$, 
regardless of the balance of the function. This represents an improvement on the 
upper bound of $N_f \le 2^{n-1} - 2^{m}$. 

Plateaued functions are a new class of functions recently introduced in [16]. 
These functions have a number of properties that are deemed desirable in cryp- 
tography. We show that, interestingly, a correlation immune function with the 
maximum nonlinearity achievable by such a function can be identified with a 
plateaued function. This provides a new avenue for the analysis and design of 
cryptographically useful correlation immune functions. 

The remaining part of this paper is organized as follows: Section 2 introduces 
basic definitions on Boolean functions, and Section 3 summarizes some of the 
important cryptographic criteria for Boolean functions. This will be followed by 
Section 4 where relevant properties of plateaued functions are discussed. Some 
useful results on correlation immune functions are introduced in Section 5. These 
results will then be used in Section 6 where our improved upper bound on the 
nonlinearity of correlation immune functions is proved. In the same section some 
relationships between correlation immune functions and plateaued functions are 
also examined. In Section 7, the new upper bound is demonstrated to be tight for 
balanced correlation immune functions. Finally the paper is closed by Section 8 
where possible directions for future research are pointed out. 



2 Boolean Functions 



We consider functions from $V_n$ to GF(2) (or simply functions on $V_n$), where $V_n$ 
is the vector space of n tuples of elements from GF(2). The truth table of a 
function f on $V_n$ is a (0, 1)-sequence defined by $(f(\alpha_0), f(\alpha_1), \ldots, f(\alpha_{2^n-1}))$, 
and the sequence of f is a (1, -1)-sequence defined by 
$((-1)^{f(\alpha_0)}, (-1)^{f(\alpha_1)}, \ldots, (-1)^{f(\alpha_{2^n-1})})$, where $\alpha_0 = (0, \ldots, 0, 0)$, 
$\alpha_1 = (0, \ldots, 0, 1)$, ..., $\alpha_{2^n-1} = (1, \ldots, 1, 1)$. The matrix of f is a 
(1, -1)-matrix of order $2^n$ defined by $M = ((-1)^{f(\alpha_i \oplus \alpha_j)})$ 
where $\oplus$ denotes the addition in $V_n$. A function f is said to be 
balanced if its truth table contains an equal number of ones and zeros. 

Given two sequences $a = (a_1, \ldots, a_m)$ and $b = (b_1, \ldots, b_m)$, their 
component-wise product is defined by $a * b = (a_1 b_1, \ldots, a_m b_m)$. In particular, if $m = 2^n$ and 
a, b are the sequences of functions f and g on $V_n$ respectively, then $a * b$ is the 
sequence of $f \oplus g$ where $\oplus$ denotes the addition in GF(2). 

Let $a = (a_1, \ldots, a_m)$ and $b = (b_1, \ldots, b_m)$ be two sequences or vectors, 
the scalar product of a and b, denoted by $\langle a, b \rangle$, is defined as the sum of the 
component-wise multiplications. In particular, when a and b are from $V_m$, 
$\langle a, b \rangle = a_1 b_1 \oplus \cdots \oplus a_m b_m$, where the addition and multiplication are over GF(2), and 
when a and b are (1, -1)-sequences, $\langle a, b \rangle = \sum_{i=1}^{m} a_i b_i$, where the addition and 
multiplication are over the reals. 

An affine function f on $V_n$ is a function that takes the form of 
$f(x_1, \ldots, x_n) = a_1 x_1 \oplus \cdots \oplus a_n x_n \oplus c$, where $a_j, c \in GF(2)$, j = 1, 2, ..., n. Furthermore f is 
called a linear function if c = 0. 

A (1, -1)-matrix N of order n is called a Hadamard matrix if $N N^T = n I_n$, 
where $N^T$ is the transpose of N and $I_n$ is the identity matrix of order n. A 
Sylvester-Hadamard matrix of order $2^n$, denoted by $H_n$, is generated by the 
following recursive relation 

$$H_0 = 1, \quad H_n = \begin{bmatrix} H_{n-1} & H_{n-1} \\ H_{n-1} & -H_{n-1} \end{bmatrix}, \quad n = 1, 2, \ldots.$$



Obviously $H_n$ is symmetric. Let $\ell_i$, $0 \le i \le 2^n - 1$, be the ith row of $H_n$. It is known 
that $\ell_i$ is the sequence of a linear function $\varphi_i(x)$ defined by the scalar product 
$\varphi_i(x) = \langle \alpha_i, x \rangle$, where $\alpha_i$ is the ith vector in $V_n$ according to the ascending 
alphabetical order. 
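
The correspondence between rows of the Sylvester-Hadamard matrix and sequences of linear functions can be checked with a short sketch. It assumes that the ith vector of $V_n$ is the n-bit binary representation of i (most significant bit first); the function names are hypothetical.

```python
def sylvester_hadamard(n):
    """Build H_n as a list of rows using H_n = [[H, H], [H, -H]] with H_0 = [1]."""
    H = [[1]]
    for _ in range(n):
        top = [row + row for row in H]
        bottom = [row + [-x for x in row] for row in H]
        H = top + bottom
    return H


def linear_sequence(i, n):
    """(1, -1)-sequence of phi_i(x) = <alpha_i, x>, with x running over V_n in order."""
    return [(-1) ** bin(i & j).count("1") for j in range(2**n)]


H3 = sylvester_hadamard(3)
rows_match = all(H3[i] == linear_sequence(i, 3) for i in range(8))
print(rows_match)
```

Here `bin(i & j).count("1")` computes the scalar product of the two bit vectors modulo 2 up to parity, which is all the sign $(-1)^{\langle \alpha_i, \alpha_j \rangle}$ depends on.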

The Hamming weight of a (0, 1)-sequence $\xi$, denoted by $HW(\xi)$, is the 
number of ones in the sequence. Given two functions f and g on $V_n$, the Hamming 
distance d(f, g) between them is defined as the Hamming weight of the truth 
table of $f(x) \oplus g(x)$, where $x = (x_1, \ldots, x_n)$. 



3 Cryptographic Criteria of Boolean Functions 

The following criteria for cryptographic Boolean functions are often considered: 
balance, nonlinearity, propagation criterion, correlation immunity, algebraic de- 
gree and non-zero linear structures. In this paper we focus mainly on nonlinearity 
and correlation immunity. 

The so-called Parseval's equation (Page 416 [7]) is a useful tool in this work:
Let $f$ be a function on $V_n$ and $\xi$ denote the sequence of $f$. Then
$\sum_{i=0}^{2^n-1} \langle \xi, \ell_i \rangle^2 = 2^{2n}$, where $\ell_i$ is the
$i$th row of $H_n$, $i = 0, 1, \ldots, 2^n - 1$.

The nonlinearity of a function $f$ on $V_n$, denoted by $N_f$, is the minimal Hamming
distance between $f$ and all affine functions on $V_n$, i.e.,
$N_f = \min_{i=1,2,\ldots,2^{n+1}} d(f, \psi_i)$, where
$\psi_1, \psi_2, \ldots, \psi_{2^{n+1}}$ are all the affine functions on $V_n$. High
nonlinearity can be used to resist a linear attack. The following characterization
of nonlinearity will be useful (for a proof see for instance [8]).

Lemma 1. The nonlinearity of $f$ on $V_n$ can be expressed by

$$N_f = 2^{n-1} - \frac{1}{2} \max\{ |\langle \xi, \ell_i \rangle|,\ 0 \le i \le 2^n - 1 \}$$

where $\xi$ is the sequence of $f$ and $\ell_0, \ldots, \ell_{2^n-1}$ are the rows of
$H_n$, namely, the sequences of linear functions on $V_n$.
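
As a sanity check on Lemma 1 and Parseval's equation, the sketch below computes the nonlinearity in two ways: from the values $\langle \xi, \ell_i \rangle$ (obtained with a fast Walsh-Hadamard transform) and by brute force over all affine functions. The helper names and the sample function $f = x_1 x_2 \oplus x_3 x_4$ on $V_4$ are assumptions chosen for illustration only.

```python
def walsh_spectrum(truth_table):
    """Return [<xi, l_i>] for all rows l_i of H_n (fast Walsh-Hadamard transform)."""
    s = [(-1) ** bit for bit in truth_table]
    h = 1
    while h < len(s):
        for start in range(0, len(s), 2 * h):
            for k in range(start, start + h):
                s[k], s[k + h] = s[k] + s[k + h], s[k] - s[k + h]
        h *= 2
    return s


def nonlinearity_lemma1(truth_table):
    """N_f = 2^(n-1) - (1/2) max |<xi, l_i>|, as in Lemma 1."""
    n = len(truth_table).bit_length() - 1
    return 2 ** (n - 1) - max(abs(v) for v in walsh_spectrum(truth_table)) // 2


def nonlinearity_bruteforce(truth_table):
    """Minimal Hamming distance to every affine function <a, x> XOR c."""
    n = len(truth_table).bit_length() - 1
    best = len(truth_table)
    for a in range(2 ** n):
        for c in (0, 1):
            dist = sum(f ^ (bin(a & x).count("1") & 1) ^ c
                       for x, f in enumerate(truth_table))
            best = min(best, dist)
    return best


# Illustrative function: f = x1 x2 XOR x3 x4 on V_4 (x1 is the top bit of x).
f = [((x >> 3) & (x >> 2) & 1) ^ ((x >> 1) & x & 1) for x in range(16)]
parseval_sum = sum(v * v for v in walsh_spectrum(f))
```

For this sample function both routines should agree, and `parseval_sum` should equal $2^{2n} = 256$.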

From Lemma 1 and Parseval's equation, it is easy to verify that
$N_f \le 2^{n-1} - 2^{\frac{1}{2}n-1}$ for any function $f$ on $V_n$. If
$N_f = 2^{n-1} - 2^{\frac{1}{2}n-1}$, then $f$ is called a bent function [9]. It is
known that a bent function on $V_n$ exists only when $n$ is even.

Let $f$ be a function on $V_n$. For a vector $\alpha \in V_n$, denote by $\xi(\alpha)$
the sequence of $f(x \oplus \alpha)$. Thus $\xi(0)$ is the sequence of $f$ itself and
$\xi(0) * \xi(\alpha)$ is the sequence of $f(x) \oplus f(x \oplus \alpha)$. Set
$\Delta_f(\alpha) = \langle \xi(0), \xi(\alpha) \rangle$, the scalar product of
$\xi(0)$ and $\xi(\alpha)$. $\Delta_f(\alpha)$ is called the auto-correlation of $f$
with a shift $\alpha$. We omit the subscript of $\Delta_f(\alpha)$ if no confusion
occurs. Obviously, $\Delta(\alpha) = 0$ if and only if $f(x) \oplus f(x \oplus \alpha)$
is balanced, i.e., $f$ satisfies the propagation criterion with respect to $\alpha$.
In the case that $f$ does not satisfy the propagation criterion with respect to a
vector $\alpha$, it may be desirable for $f(x) \oplus f(x \oplus \alpha)$ to be almost
balanced. That is, one may require $|\Delta_f(\alpha)|$ to be a small value.
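
The auto-correlation can be tabulated directly from a truth table. The following sketch is illustrative only; the helper names and the sample function $g = x_1 x_2 \oplus x_3$ on $V_3$ are assumptions, not taken from the paper.

```python
def autocorrelation(truth_table, alpha):
    """Delta_f(alpha) = sum over x of (-1)^(f(x) XOR f(x XOR alpha))."""
    return sum((-1) ** (truth_table[x] ^ truth_table[x ^ alpha])
               for x in range(len(truth_table)))


def satisfies_propagation(truth_table, alpha):
    """True when f(x) XOR f(x XOR alpha) is balanced, i.e. Delta_f(alpha) = 0."""
    return autocorrelation(truth_table, alpha) == 0


# Illustrative function: g = x1 x2 XOR x3 on V_3 (x1 is the top bit of x).
g = [((x >> 2) & (x >> 1) & 1) ^ (x & 1) for x in range(8)]
profile = [autocorrelation(g, alpha) for alpha in range(8)]
```

For this $g$, the shift $\alpha = (0,0,1)$ flips the output for every $x$, so $\Delta(\alpha) = -8$ there, while every other non-zero shift satisfies the propagation criterion.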

The concept of correlation immune functions was introduced by Siegenthaler
[12]. Xiao and Massey gave an equivalent definition [2,5]: A function $f$ on $V_n$ is
called an $m$th-order correlation immune function if

$$\sum_{x \in V_n} f(x) (-1)^{\langle \beta, x \rangle} = 0$$

for all $\beta \in V_n$ with $1 \le HW(\beta) \le m$, where in the sum, $f(x)$ and
$\langle \beta, x \rangle$ are regarded as real-valued functions. From the first
equality in Section 4.2 of [2], a correlation immune function can also be
equivalently restated as follows: Let $f$ be a function on $V_n$ and let $\xi$ be its
sequence. Then $f$ is called an $m$th-order correlation immune function if
$\langle \xi, \ell \rangle = 0$ for every $\ell$, where $\ell$ is the sequence of a
linear function $\varphi(x) = \langle \alpha, x \rangle$ on $V_n$ constrained by
$1 \le HW(\alpha) \le m$. In fact, $\langle \xi, \ell_i \rangle = 0$, where $\ell_i$ is
the $i$th row of $H_n$, if and only if $f(x) \oplus \langle \alpha_i, x \rangle$ is
balanced, where $\alpha_i$ is the binary representation of an integer $i$,
$0 \le i \le 2^n - 1$.
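
The restated definition translates into a short test on the values $\langle \xi, \ell_i \rangle$. The sketch below is a minimal illustration under that reading; the function names and the two sample functions are assumptions made for the example.

```python
def walsh_spectrum(truth_table):
    """Return [<xi, l_i>] for all rows l_i of H_n (fast Walsh-Hadamard transform)."""
    s = [(-1) ** bit for bit in truth_table]
    h = 1
    while h < len(s):
        for start in range(0, len(s), 2 * h):
            for k in range(start, start + h):
                s[k], s[k + h] = s[k] + s[k + h], s[k] - s[k + h]
        h *= 2
    return s


def correlation_immunity_order(truth_table):
    """Largest m with <xi, l_i> = 0 whenever 1 <= HW(alpha_i) <= m."""
    spectrum = walsh_spectrum(truth_table)
    n = len(truth_table).bit_length() - 1
    order = 0
    for weight in range(1, n + 1):
        hit = any(spectrum[i] != 0
                  for i in range(2 ** n) if bin(i).count("1") == weight)
        if hit:
            break
        order = weight
    return order


# Illustrative functions on V_3: the parity x1 XOR x2 XOR x3, and x1 alone.
parity3 = [bin(x).count("1") & 1 for x in range(8)]
first_variable = [(x >> 2) & 1 for x in range(8)]
```

The parity function correlates only with the weight-3 linear function, so it is 2nd-order correlation immune, whereas a single variable is not correlation immune at all.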

Correlation immune functions are used in the design of running-key generators 
in stream ciphers to resist a correlation attack and the design of hash functions. 
Relevant discussions on correlation immune functions, more generally on resilient 
functions, can be found in [15]. 

Let $f$ be a function on $V_n$ and $\xi$ denote the sequence of $f$. We introduce two
new notations:

1. Set $\Im_f = \{ i \mid \langle \xi, \ell_i \rangle \ne 0,\ 0 \le i \le 2^n - 1 \}$,
where $\ell_i$ is the $i$th row of $H_n$.

2. Set $\Im_f^* = \{ \alpha_i \mid \langle \xi, \ell_i \rangle \ne 0,\ 0 \le i \le 2^n - 1 \}$,
where $\alpha_i$ is the binary representation of an integer $i$,
$0 \le i \le 2^n - 1$, and $\ell_{\alpha_i}$ is identified with $\ell_i$.

$\Im_f^*$ is essentially the same as $\Im_f$ with the only difference being that its
elements are represented by a binary vector in $V_n$. We will simply write $\Im_f$ as
$\Im$ and $\Im_f^*$ as $\Im^*$ when no confusion arises. It is easy to verify that
$\#\Im_f$ and $\#\Im_f^*$ are invariant under any nonsingular linear transformation
on the variables of the function $f$. $\#\Im_f$ ($\#\Im_f^*$) together with the
distribution of $\Im_f$ ($\Im_f^*$) determines the correlation immunity and other
cryptographic properties of a function.

4 An Overview of Plateaued Functions 

The concept of plateaued functions was introduced in [16]. 

Definition 1. Let $f$ be a function on $V_n$ and $\xi$ denote the sequence of $f$. If
there exists an even number $r$, $0 \le r \le n$, such that $\#\Im = 2^r$ and each
$\langle \xi, \ell_j \rangle^2$ takes the value of $2^{2n-r}$ or 0 only, where
$\ell_j$ denotes the $j$th row of $H_n$, $j = 0, 1, \ldots, 2^n - 1$, then $f$ is
called an $r$th-order plateaued function on $V_n$. $f$ is also simply called a
plateaued function on $V_n$ if we ignore the particular order $r$.

Due to Parseval's equation, the condition that $\#\Im = 2^r$ can be obtained
from the condition that "each $\langle \xi, \ell_j \rangle^2$ takes the value of
$2^{2n-r}$ or 0 only, where $\ell_j$ denotes the $j$th row of $H_n$,
$j = 0, 1, \ldots, 2^n - 1$". For the sake of convenience, however, we have mentioned
both conditions in the definition of plateaued functions.
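
Definition 1 can be checked mechanically. The sketch below, whose helper names and sample functions are assumptions made for illustration, returns the order $r$ when the squared values $\langle \xi, \ell_j \rangle^2$ take only the values $2^{2n-r}$ and 0, and `None` otherwise.

```python
def walsh_spectrum(truth_table):
    """Return [<xi, l_j>] for all rows l_j of H_n (fast Walsh-Hadamard transform)."""
    s = [(-1) ** bit for bit in truth_table]
    h = 1
    while h < len(s):
        for start in range(0, len(s), 2 * h):
            for k in range(start, start + h):
                s[k], s[k + h] = s[k] + s[k + h], s[k] - s[k + h]
        h *= 2
    return s


def plateaued_order(truth_table):
    """Return r if f is an rth-order plateaued function (Definition 1), else None."""
    spectrum = walsh_spectrum(truth_table)
    n = len(truth_table).bit_length() - 1
    squares = {v * v for v in spectrum if v != 0}
    if len(squares) != 1:
        return None
    support = sum(1 for v in spectrum if v != 0)   # this is #Im
    r = support.bit_length() - 1
    if support != 2 ** r or r % 2 or squares != {2 ** (2 * n - r)}:
        return None
    return r


# Illustrative functions (x1 is the top bit of x).
bent4 = [((x >> 3) & (x >> 2) & 1) ^ ((x >> 1) & x & 1) for x in range(16)]
affine4 = [((x >> 3) ^ (x >> 1)) & 1 for x in range(16)]
mixed3 = [((x >> 2) & (x >> 1) & 1) ^ (x & 1) for x in range(8)]
cubic3 = [1 if x == 7 else 0 for x in range(8)]
```

On these samples the bent function yields $r = n = 4$, the affine function yields $r = 0$, $x_1 x_2 \oplus x_3$ yields $r = 2$, and $x_1 x_2 x_3$ is not plateaued.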

Some facts about plateaued functions follow: (1) if $f$ is an $r$th-order plateaued
function, then $r$ must be even, (2) $f$ is an $n$th-order plateaued function if and
only if $f$ is bent, (3) $f$ is a 0th-order plateaued function if and only if $f$ is
affine. All the following results can be found in [16].

Theorem 1. Let $f$ be a function on $V_n$ and $\xi$ denote the sequence of $f$. Set
$p_M = \max\{ |\langle \xi, \ell_j \rangle|,\ j = 0, 1, \ldots, 2^n - 1 \}$, where
$\ell_j$ is the $j$th row of $H_n$. Then the following statements are equivalent:
(i) $f$ is a plateaued function on $V_n$,
(ii) $\sum_{j=0}^{2^n-1} \Delta^2(\alpha_j) = \frac{2^{3n}}{\#\Im}$,
(iii) the nonlinearity $N_f$ of $f$ satisfies $N_f = 2^{n-1} - \frac{2^{n-1}}{\sqrt{\#\Im}}$,
(iv) $p_M \sqrt{\#\Im} = 2^n$,
(v) $N_f = 2^{n-1} - 2^{-\frac{n}{2}-1} \sqrt{\sum_{j=0}^{2^n-1} \Delta^2(\alpha_j)}$.

Theorem 2. Let $f$ be a function on $V_n$ and $\xi$ denote the sequence of $f$. Then

$$\sum_{j=0}^{2^n-1} \Delta^2(\alpha_j) \ge \frac{2^{3n}}{\#\Im}$$

where the equality holds if and only if $f$ is a plateaued function.

Theorem 3. Let $f$ be a function on $V_n$ and $\xi$ denote the sequence of $f$. Then
the nonlinearity $N_f$ of $f$ satisfies $N_f \le 2^{n-1} - \frac{2^{n-1}}{\sqrt{\#\Im}}$,
where the equality holds if and only if $f$ is a plateaued function.

Theorem 4. Let $f$ be a function on $V_n$ and $\xi$ denote the sequence of $f$. Then
the nonlinearity $N_f$ of $f$ satisfies

$$N_f \le 2^{n-1} - 2^{-\frac{n}{2}-1} \sqrt{\sum_{j=0}^{2^n-1} \Delta^2(\alpha_j)}$$

where the equality holds if and only if $f$ is a plateaued function on $V_n$.

Proposition 1. Let $f$ be an $r$th-order plateaued function on $V_n$. Then the
nonlinearity $N_f$ of $f$ satisfies $N_f = 2^{n-1} - 2^{n-\frac{r}{2}-1}$.



5 Some Useful Results on Correlation Immune Functions

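Consider a function $f$ on $V_n$. Denote by $\xi = (a_0, a_1, \ldots, a_{2^n-1})$,
where $a_j = \pm 1$, the sequence of $f$. Obviously

$$(a_0, a_1, \ldots, a_{2^n-1}) H_n = (\langle \xi, \ell_0 \rangle, \langle \xi, \ell_1 \rangle, \ldots, \langle \xi, \ell_{2^n-1} \rangle) \qquad (1)$$

where $\ell_i$ is the $i$th row of $H_n$, $i = 0, 1, \ldots, 2^n - 1$.

Let $p$ be an integer with $1 \le p \le n - 1$. Rewrite (1) as

$$(a_0, a_1, \ldots, a_{2^n-1}) (H_p \times H_{n-p}) = (\langle \xi, \ell_0 \rangle, \langle \xi, \ell_1 \rangle, \ldots, \langle \xi, \ell_{2^n-1} \rangle) \qquad (2)$$

where $\times$ is the Kronecker product [14].

Let $e_i$ denote the $i$th row of $H_{n-p}$, $i = 0, 1, \ldots, 2^{n-p} - 1$. For any
fixed $j$ with $0 \le j \le 2^{n-p} - 1$, comparing the $j$th, the $(j + 2^{n-p})$th,
..., the $(j + (2^p - 1) 2^{n-p})$th terms in the two sides of (2), we have

$$(a_0, a_1, \ldots, a_{2^n-1}) (H_p \times e_j^T) = (\langle \xi, \ell_j \rangle, \langle \xi, \ell_{j+2^{n-p}} \rangle, \ldots, \langle \xi, \ell_{j+(2^p-1)2^{n-p}} \rangle)$$

Write $\xi = (\xi_0, \xi_1, \ldots, \xi_{2^p-1})$ where each $\xi_i$ is of length
$2^{n-p}$. Then we have

$$(\langle \xi_0, e_j \rangle, \langle \xi_1, e_j \rangle, \ldots, \langle \xi_{2^p-1}, e_j \rangle) H_p = (\langle \xi, \ell_j \rangle, \langle \xi, \ell_{j+2^{n-p}} \rangle, \ldots, \langle \xi, \ell_{j+(2^p-1)2^{n-p}} \rangle)$$

Hence

$$2^p (\langle \xi_0, e_j \rangle, \langle \xi_1, e_j \rangle, \ldots, \langle \xi_{2^p-1}, e_j \rangle) = (\langle \xi, \ell_j \rangle, \langle \xi, \ell_{j+2^{n-p}} \rangle, \ldots, \langle \xi, \ell_{j+(2^p-1)2^{n-p}} \rangle) H_p \qquad (3)$$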
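
Equation (3) is easy to confirm numerically. The sketch below draws a random $(1,-1)$-sequence $\xi$ of length $2^n$ and compares both sides of (3) for every admissible $p$ and $j$; the function names, the seed, and the choice $n = 5$ are arbitrary assumptions made for the illustration.

```python
import random


def hadamard_row(i, n):
    """Row i of H_n: the entry at position x is (-1)^<alpha_i, x>."""
    return [(-1) ** bin(i & x).count("1") for x in range(2 ** n)]


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def equation_3_holds(xi, n, p, j):
    block = 2 ** (n - p)
    e_j = hadamard_row(j, n - p)
    # Left side: 2^p (<xi_0, e_j>, ..., <xi_{2^p - 1}, e_j>).
    lhs = [2 ** p * dot(xi[i * block:(i + 1) * block], e_j)
           for i in range(2 ** p)]
    # Right side: (<xi, l_j>, <xi, l_{j + 2^(n-p)}>, ...) multiplied by H_p.
    picked = [dot(xi, hadamard_row(j + i * block, n)) for i in range(2 ** p)]
    h_p = [hadamard_row(i, p) for i in range(2 ** p)]
    rhs = [sum(picked[r] * h_p[r][c] for r in range(2 ** p))
           for c in range(2 ** p)]
    return lhs == rhs


rng = random.Random(2000)
n = 5
xi = [rng.choice((1, -1)) for _ in range(2 ** n)]
all_hold = all(equation_3_holds(xi, n, p, j)
               for p in range(1, n) for j in range(2 ** (n - p)))
```

The identity rests on $H_p H_p = 2^p I$, so `all_hold` should be true for any input sequence, not only for the sampled one.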

Based on these discussions, we have the following lemma. 

Lemma 2. Let $f$ be an $m$th-order correlation immune function on $V_n$, where
$m \le n - 2$, and $\xi$ be the sequence of $f$. Then
$\langle \xi, \ell_{(2^{m+1}-1)2^{n-m-1}} \rangle \equiv 0 \pmod{2^{m+2}}$ if and only if
$\langle \xi, \ell_0 \rangle \equiv 0 \pmod{2^{m+2}}$, where $\ell_0$ is the top row of
$H_n$.

Proof. Set $W = \{ \alpha_0, \alpha_{2^{n-m-1}}, \alpha_{2 \cdot 2^{n-m-1}}, \ldots, \alpha_{(2^{m+1}-1)2^{n-m-1}} \}$,
where each $\alpha_j$ is the binary representation of an integer $j$. Note that $W$
is an $(m+1)$-dimensional linear subspace of $V_n$.

Write $\xi = (\xi_0, \xi_1, \xi_2, \ldots, \xi_{2^{m+1}-1})$, where each $\xi_i$ is of
length $2^{n-m-1}$. Let $p = m + 1$ and $j = 0$ in (3); we have

$$2^{m+1} (\langle \xi_0, e_0 \rangle, \langle \xi_1, e_0 \rangle, \ldots, \langle \xi_{2^{m+1}-1}, e_0 \rangle) = (\langle \xi, \ell_0 \rangle, \langle \xi, \ell_{2^{n-m-1}} \rangle, \langle \xi, \ell_{2 \cdot 2^{n-m-1}} \rangle, \ldots, \langle \xi, \ell_{(2^{m+1}-1)2^{n-m-1}} \rangle) H_{m+1} \qquad (4)$$

where $e_0$ denotes the 0th row of $H_{n-m-1}$, i.e., the all-one sequence of length
$2^{n-m-1}$.

As $HW(\alpha_{j \cdot 2^{n-m-1}}) \le m$, we have
$\langle \xi, \ell_{j \cdot 2^{n-m-1}} \rangle = 0$, where $j = 1, \ldots, 2^{m+1} - 2$.
Therefore (4) can be rewritten as

$$2^{m+1} (\langle \xi_0, e_0 \rangle, \langle \xi_1, e_0 \rangle, \ldots, \langle \xi_{2^{m+1}-1}, e_0 \rangle) = (\langle \xi, \ell_0 \rangle, 0, \ldots, 0, \langle \xi, \ell_{(2^{m+1}-1)2^{n-m-1}} \rangle) H_{m+1} \qquad (5)$$

Comparing the rightmost term in the two sides of (5), we have

$$2^{m+1} \langle \xi_{2^{m+1}-1}, e_0 \rangle = \langle \xi, \ell_0 \rangle + (-1)^{m+1} \langle \xi, \ell_{(2^{m+1}-1)2^{n-m-1}} \rangle \qquad (6)$$

Note that the length of $\xi_{2^{m+1}-1}$ and $e_0$ is even. Hence
$\langle \xi_{2^{m+1}-1}, e_0 \rangle$ must be even. From this it follows that
$2^{m+1} \langle \xi_{2^{m+1}-1}, e_0 \rangle \equiv 0 \pmod{2^{m+2}}$. Finally, by
considering (6), we have proved that
$\langle \xi, \ell_{(2^{m+1}-1)2^{n-m-1}} \rangle \equiv 0 \pmod{2^{m+2}}$ if and only if
$\langle \xi, \ell_0 \rangle \equiv 0 \pmod{2^{m+2}}$. □

By choosing a different $W$ in the proof of Lemma 2, we can prove the following
lemma in a similar way.

Lemma 3. Let $f$ be an $m$th-order correlation immune function on $V_n$, where
$m \le n - 2$, and $\xi$ be the sequence of $f$. Let $j_0$ be an integer satisfying
$0 \le j_0 \le 2^n - 1$ and $HW(\alpha_{j_0}) = m + 1$, where $\alpha_{j_0}$ is the
binary representation of the integer $j_0$. Then
$\langle \xi, \ell_{j_0} \rangle \equiv 0 \pmod{2^{m+2}}$ if and only if
$\langle \xi, \ell_0 \rangle \equiv 0 \pmod{2^{m+2}}$.

Lemma 3 allows us to claim 

Lemma 4. Let $f$ be an $m$th-order correlation immune function on $V_n$, where
$m \le n - 2$, and $\xi$ be the sequence of $f$. Let $j_0$ be an integer satisfying
$0 \le j_0 \le 2^n - 1$ and $HW(\alpha_{j_0}) = m + 1$. If
$\langle \xi, \ell_{j_0} \rangle \equiv 0 \pmod{2^{m+2}}$ then
$\langle \xi, \ell_j \rangle \equiv 0 \pmod{2^{m+2}}$ for any integer $j$ satisfying
$HW(\alpha_j) = m + 1$, where $\alpha_j$ is the binary representation of $j$.

The condition of $HW(\alpha_j) = m + 1$ in the lemma above can be removed, as
is shown below.

Lemma 5. Let $f$ be an $m$th-order correlation immune function on $V_n$, where
$m \le n - 2$, and $\xi$ be the sequence of $f$. Let $j_0$ be an integer satisfying
$0 \le j_0 \le 2^n - 1$ and $HW(\alpha_{j_0}) = m + 1$, where $\alpha_{j_0}$ is the
binary representation of $j_0$. If
$\langle \xi, \ell_{j_0} \rangle \equiv 0 \pmod{2^{m+2}}$, then
$\langle \xi, \ell_i \rangle \equiv 0 \pmod{2^{m+2}}$ for any row $\ell_i$ of $H_n$.

Proof. We use induction on $HW(\alpha_j)$ to prove that
$\langle \xi, \ell_j \rangle \equiv 0 \pmod{2^{m+2}}$, where $\alpha_j$ is the binary
representation of the subscript $j$ of $\ell_j$.

For $0 < HW(\alpha_j) \le m$, since $f$ is an $m$th-order correlation immune function,
we have $\langle \xi, \ell_j \rangle = 0$. On the other hand, from Lemma 4, we have
$\langle \xi, \ell_j \rangle \equiv 0 \pmod{2^{m+2}}$, where $\ell_j$ is any row of
$H_n$ satisfying $HW(\alpha_j) = m + 1$, and $\alpha_j$ is the binary representation
of $j$. Due to Lemma 3, we also have
$\langle \xi, \ell_0 \rangle \equiv 0 \pmod{2^{m+2}}$. Hence we have proved
$\langle \xi, \ell_j \rangle \equiv 0 \pmod{2^{m+2}}$ when $HW(\alpha_j) \le m + 1$.

Now assume that $\langle \xi, \ell_j \rangle \equiv 0 \pmod{2^{m+2}}$ when
$m + 1 \le HW(\alpha_j) \le k \le n - 1$. Consider the case of $HW(\alpha_j) = k + 1$.
Obviously, $W$ can be rewritten as
$W = \{ \alpha_0, \alpha_{2^{n-k-1}}, \alpha_{2 \cdot 2^{n-k-1}}, \ldots, \alpha_{(2^{k+1}-1)2^{n-k-1}} \}$,
where each $\alpha_j$ is the binary representation of an integer $j$. One can see
that $W$ is a $(k+1)$-dimensional linear subspace.

Let $\xi = (\xi_0, \xi_1, \xi_2, \ldots, \xi_{2^{k+1}-1})$, where each $\xi_i$ is of
length $2^{n-k-1}$. Furthermore, let $p = k + 1$ and $j = 0$ in (3). Then we have

$$2^{k+1} (\langle \xi_0, e_0 \rangle, \langle \xi_1, e_0 \rangle, \ldots, \langle \xi_{2^{k+1}-1}, e_0 \rangle) = (\langle \xi, \ell_0 \rangle, \langle \xi, \ell_{2^{n-k-1}} \rangle, \langle \xi, \ell_{2 \cdot 2^{n-k-1}} \rangle, \ldots, \langle \xi, \ell_{(2^{k+1}-1)2^{n-k-1}} \rangle) H_{k+1} \qquad (7)$$

where $e_0$ denotes the 0th row of $H_{n-k-1}$, i.e., the all-one sequence of length
$2^{n-k-1}$.

By the assumption, we should have
$\langle \xi, \ell_j \rangle \equiv 0 \pmod{2^{m+2}}$ where $j = i \cdot 2^{n-k-1}$,
$i = 0, 1, \ldots, 2^{k+1} - 2$. Note that $k \ge m + 1$. From (7), we have
$\langle \xi, \ell_{(2^{k+1}-1)2^{n-k-1}} \rangle \equiv 0 \pmod{2^{m+2}}$.
Furthermore, note that $HW(\alpha_{(2^{k+1}-1)2^{n-k-1}}) = k + 1$. Taking into
account Lemma 4, we can conclude that
$\langle \xi, \ell_j \rangle \equiv 0 \pmod{2^{m+2}}$ for $HW(\alpha_j) = k + 1$, where
$\alpha_j$ is the binary representation of $j$. This completes the proof. □

In the following section, we will use these results to improve the upper bound
on the nonlinearity of correlation immune functions.

6 Improving Upper Bounds on Nonlinearity 

The following lemma will be used in proving Theorem 5. 

Lemma 6. Let $f$ be an $m$th-order correlation immune function on $V_n$, where
$\frac{1}{2}n - 1 \le m \le n - 2$, and $\xi$ denotes the sequence of $f$. If
$\binom{n}{m+1} > 2^{2n-2m-2}$, then there must be an integer $j_0$,
$0 \le j_0 \le 2^n - 1$, such that $HW(\alpha_{j_0}) = m + 1$ and
$\langle \xi, \ell_{j_0} \rangle = 0$, where $\alpha_{j_0}$ is the binary
representation of integer $j_0$.

Proof. Since $f$ is an $m$th-order correlation immune function on $V_n$, due to
Theorem 3 of [10], we have $\langle \xi, \ell \rangle \equiv 0 \pmod{2^{m+1}}$, where
$\ell$ is any row of $H_n$. Hence $\langle \xi, \ell \rangle \ne 0$ implies that
$|\langle \xi, \ell \rangle| \ge 2^{m+1}$. Using Parseval's equation (Page 416 [7]),
we have $\#\Im \le 2^{2n-2m-2}$.

Note that the number of vectors $\alpha$ in $V_n$ satisfying $HW(\alpha) = m + 1$ is
equal to $\binom{n}{m+1}$. Since $\binom{n}{m+1} > 2^{2n-2m-2}$, there must be a
vector $\alpha_{j_0}$ such that $HW(\alpha_{j_0}) = m + 1$ and
$\alpha_{j_0} \notin \Im^*$. As a result, we have
$\langle \xi, \ell_{j_0} \rangle = 0$, where $\alpha_{j_0}$ is the binary
representation of $j_0$. □



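Theorem 5. Let $f$ be an $m$th-order correlation immune function on $V_n$, where
$\frac{1}{2}n - 1 \le m \le n - 2$. If $\binom{n}{m+1} > 2^{2n-2m-2}$, then
$N_f \le 2^{n-1} - 2^{m+1}$, where the equality holds if and only if $f$ is a
$2(n - m - 2)$th-order plateaued function.

Proof. By Lemma 6, there must be a vector $\alpha_{j_0}$ such that
$HW(\alpha_{j_0}) = m + 1$ and $\langle \xi, \ell_{j_0} \rangle = 0$. Now using
Lemma 5, we have

$$\langle \xi, \ell \rangle \equiv 0 \pmod{2^{m+2}} \qquad (8)$$

where $\ell$ is any row of $H_n$. Lemma 1 implies that $N_f \le 2^{n-1} - 2^{m+1}$.

Assume that $N_f = 2^{n-1} - 2^{m+1}$. From Lemma 1, we have

$$\max\{ |\langle \xi, \ell_i \rangle|,\ 0 \le i \le 2^n - 1 \} = 2^{m+2} \qquad (9)$$

Combining (8) and (9), we can conclude that $|\langle \xi, \ell \rangle| = 2^{m+2}$ if
$\langle \xi, \ell \rangle \ne 0$. This proves that $f$ is a $2(n - m - 2)$th-order
plateaued function.

Conversely, if $f$ is a $2(n - m - 2)$th-order plateaued function, due to
Proposition 1, we must have $N_f = 2^{n-1} - 2^{m+1}$. □

Let $n$ and $m$ be two integers with $n > m > 0$. We claim that the
following inequality holds:

$$\binom{n}{m+1} > \left( \frac{n+m+2}{n-m} \right)^{n-m-1} \qquad (10)$$

To prove the claim, we set $p(i) = \frac{(n-i)(m+2+i)}{(i+1)(n-m-1-i)}$,
$0 \le i \le \frac{1}{2}(n - m - 2)$.

Since $\binom{n}{m+1} = \binom{n}{n-m-1}$, it is easy to verify that

$$\binom{n}{m+1} = \begin{cases} p(0) p(1) \cdots p(\frac{1}{2}(n-m-2) - 1) \cdot \frac{n+m+2}{n-m}, & \text{if } n - m \text{ is even} \\ p(0) p(1) \cdots p(\frac{1}{2}(n-m-3)), & \text{if } n - m \text{ is odd} \end{cases} \qquad (11)$$

In addition, one can also verify that $p$ satisfies the condition of
$p(i) < p(i-1)$. Hence

$$\binom{n}{m+1} > \begin{cases} \left( p(\frac{1}{2}(n-m-2)) \right)^{\frac{1}{2}(n-m-2)} \cdot \frac{n+m+2}{n-m}, & \text{if } n - m \text{ is even} \\ \left( p(\frac{1}{2}(n-m-3)) \right)^{\frac{1}{2}(n-m-1)}, & \text{if } n - m \text{ is odd} \end{cases} \qquad (12)$$

There exist two cases to be considered: $n - m$ is even and $n - m$ is odd.

In the former case, we note that
$p(\frac{1}{2}(n-m-2)) = \left( \frac{n+m+2}{n-m} \right)^2$. Due to (12), we obtain
$\binom{n}{m+1} > \left( \frac{n+m+2}{n-m} \right)^{n-m-1}$.

In the latter case, as
$p(\frac{1}{2}(n-m-3)) = \frac{(n+m+3)(n+m+1)}{(n-m+1)(n-m-1)} > \left( \frac{n+m+2}{n-m} \right)^2$,
taking into account (12), we have
$\binom{n}{m+1} > \left( \frac{n+m+2}{n-m} \right)^{n-m-1}$. Thus the inequality in
(10) is indeed true.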



Theorem 6. Let $f$ be an $m$th-order correlation immune function on $V_n$. If $m$
and $n$ satisfy the condition of $0.6n - 0.4 \le m \le n - 2$, then
$N_f \le 2^{n-1} - 2^{m+1}$, where the equality holds if and only if $f$ is also a
$2(n - m - 2)$th-order plateaued function.

Proof. One can verify that

$$\frac{n + \lambda_1 + 2}{n - \lambda_1} > \frac{n + \lambda_2 + 2}{n - \lambda_2}$$

for $n > \lambda_1 > \lambda_2 > 0$, where $\lambda_1$ and $\lambda_2$ are not
necessarily integers. Since $m \ge 0.6n - 0.4$, we have

$$\left( \frac{n+m+2}{n-m} \right)^{n-m-1} \ge \left( \frac{n + 0.6n - 0.4 + 2}{n - (0.6n - 0.4)} \right)^{n-m-1} = 2^{2n-2m-2}$$

By using (10), we can conclude that $\binom{n}{m+1} > 2^{2n-2m-2}$. Taking into
account Theorem 5, we know that the theorem is indeed true. □

Part (i) of Theorem 4 in [10] states that the nonlinearity $N_f$ of an $m$th-order
correlation immune function $f$ on $V_n$ satisfies $N_f \le 2^{n-1} - 2^m$, when
$m > \frac{1}{2}n - 1$. Our Theorem 6 represents an improvement on the result in
[10], especially for the case of $m \ge 0.6n - 0.4$.
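
The counting condition behind Theorems 5 and 6 can be tabulated with exact integer arithmetic. The sketch below is only an illustration: the function names are hypothetical, the range $0.6n - 0.4 \le m \le n - 2$ is rewritten as $3n - 2 \le 5m$ to avoid floating point, and the sampled interval $5 \le n \le 30$ is an arbitrary choice.

```python
from math import comb


def theorem5_condition(n, m):
    """Hypothesis of Theorem 5: C(n, m+1) > 2^(2n - 2m - 2)."""
    return comb(n, m + 1) > 2 ** (2 * n - 2 * m - 2)


def in_theorem6_range(n, m):
    """0.6n - 0.4 <= m <= n - 2, written with integers as 3n - 2 <= 5m."""
    return 3 * n - 2 <= 5 * m and m <= n - 2


# All (n, m) pairs in the Theorem 6 range for the sampled values of n.
pairs = [(n, m) for n in range(5, 31) for m in range(1, n)
         if in_theorem6_range(n, m)]
all_satisfied = all(theorem5_condition(n, m) for n, m in pairs)
```

For instance, $n = 9$ and $m = 5$ gives $\binom{9}{6} = 84 > 2^6 = 64$, whereas $n = 6$ and $m = 3$ (the parameters discussed in Section 7) gives $\binom{6}{4} = 15 < 2^4 = 16$ and lies outside the range.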

As a consequence of Theorem 5 or Theorem 6, a correlation immune function
that achieves the maximum nonlinearity for such a function also satisfies all the
properties of plateaued functions, as discussed in Section 4. As a result, by taking
into account Theorem 2, we have

Corollary 1. Let $f$ be an $m$th-order correlation immune function on $V_n$. If
$0.6n - 0.4 \le m \le n - 2$, then $N_f \le 2^{n-1} - 2^{m+1}$, where the equality holds
if and only if $f$ is also a $2(n - m - 2)$th-order plateaued function or the
equality in Theorem 2 holds, i.e.,
$\sum_{j=0}^{2^n-1} \Delta^2(\alpha_j) = 2^{n+2m+4}$.

An $(n, m, t)$-resilient function is an $n$-input $m$-output function or mapping $F$
with the property that it runs through every possible output $m$-tuple an equal
number of times when $t$ arbitrary inputs are fixed and the remaining $n - t$ inputs
run through all the $2^{n-t}$ input tuples once. The concept was introduced by Chor
et al. in [3] and, independently, by Bennett et al. in [1]. Comparing the definition
of resilient functions with that of correlation immune functions, one can see
that an $(n, 1, t)$-resilient function coincides with a balanced $t$th-order
correlation immune function on $V_n$. In this context, Theorem 1 of [15] is of
special interest to practitioners, as it shows that each non-zero linear
combination of the component functions of an $(n, m, t)$-resilient function is also
a balanced $t$th-order correlation immune function on $V_n$, giving rise to
$2^m - 1$ distinct, balanced $t$th-order correlation immune functions in total.

To close this section, we point out a result which follows from Theorem 2 of
[10] and Theorem 2 in this paper.

Corollary 2. Let $f$ be an $(n, 1, m)$-resilient function, where
$\frac{1}{2}n - 2 < m \le n - 3$. Then the nonlinearity $N_f$ of $f$ satisfies
$N_f \le 2^{n-1} - 2^{m+1}$, where the equality holds if and only if $f$ is also a
$2(n - m - 2)$th-order plateaued function or the equality in Theorem 2 holds, i.e.,
$\sum_{j=0}^{2^n-1} \Delta^2(\alpha_j) = 2^{n+2m+4}$.

7 Tightness of the Upper Bound 

As Theorem 6 represents an improved upper bound on the nonlinearity of all
the correlation immune functions, including both balanced and unbalanced ones,
we are further interested in the question as to whether the upper bound is tight
or not. It turns out that the question can be answered in the affirmative
for balanced correlation immune functions. The approach we take is to actually
demonstrate the existence of $m$th-order correlation immune, balanced functions
on $V_n$ whose nonlinearity $N_f$ satisfies $N_f = 2^{n-1} - 2^{m+1}$.

We note that [11] is the earliest paper to study the nonlinearity of correlation
immune functions. Of particular importance are Theorems 9 and 14 in [11], which
happen to be also relevant to the current work. Theorem 9 of [11] proved the
equivalence of two different methods for constructing correlation immune
functions, while Theorem 14 in the same paper showed how to obtain highly
nonlinear correlation immune functions. Let integers $n$ and $m$ satisfy
$m + 2 \ge 2^{n-m-2}$ and n > 16. For such $n$ and $m$, there exist non-zero vectors
in $V_{m+2}$, say $\gamma_0, \gamma_1, \ldots, \gamma_{2^{n-m-2}-1}$, such that
$HW(\gamma_j) \ge m + 1$, where $j = 0, 1, \ldots, 2^{n-m-2} - 1$. Define a mapping
$P$ from $V_{n-m-2}$ to $V_{m+2}$ such that
$P(V_{n-m-2}) = \{ \gamma_0, \gamma_1, \ldots, \gamma_{2^{n-m-2}-1} \}$, where
$P(V_{n-m-2}) = \{ P(\delta) \mid \delta \in V_{n-m-2} \}$. Based on $P$, we construct
a function $f$ on $V_n$ by $f(x) = f(y, z) = P(y) z^T$, where $x = (y, z)$,
$y \in V_{n-m-2}$ and $z \in V_{m+2}$. By using Theorems 9 and 14 of [11], modifying
the relevant parameters accordingly, and fixing $t$ to 1, we can construct an
$(n, 1, m)$-resilient (balanced) function $f$ whose nonlinearity $N_f$ reaches the
upper bound of $2^{n-1} - 2^{m+1}$.

As a concrete example, let $n = 9$ and $m = 5$. Then $m + 2 \ge 2^{n-m-2}$.
Set $\gamma_0 = (1, 1, 1, 1, 1, 1, 1)$, $\gamma_1 = (1, 1, 1, 1, 1, 1, 0)$,
$\gamma_2 = (1, 1, 1, 1, 1, 0, 1)$ and $\gamma_3 = (1, 1, 1, 1, 0, 1, 1)$. Then each
$\gamma_j \in V_7$ and $HW(\gamma_j) \ge 6$. Define a mapping $P$ from $V_2$ to $V_7$
such that $P(0, 0) = \gamma_0$, $P(0, 1) = \gamma_1$, $P(1, 0) = \gamma_2$ and
$P(1, 1) = \gamma_3$. Based on $P$, we construct a function $f$ on $V_9$ by
$f(x) = f(y, z) = P(y) z^T$, where $x = (y, z)$, $y \in V_2$ and $z \in V_7$.
Theorems 9 and 14 in [11] tell us that $f$ is a 5th-order correlation immune
function on $V_9$, and the nonlinearity $N_f$ of $f$ achieves
$N_f = 2^8 - 2^6 = 192$, the highest possible value for such a function. Since each
$\gamma_j$ is non-zero, $f$ is balanced. One can verify that the function $f$ takes
the form of

$$f(y, z) = y_1 z_6 \oplus y_2 z_7 \oplus y_1 y_2 (z_5 \oplus z_6 \oplus z_7) \oplus z_1 \oplus z_2 \oplus z_3 \oplus z_4 \oplus z_5 \oplus z_6 \oplus z_7$$

where $y = (y_1, y_2)$ and $z = (z_1, z_2, z_3, z_4, z_5, z_6, z_7)$.
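
The stated properties of this example can be reproduced by brute force over all 512 inputs. The sketch below is an illustration under a few assumptions: $z$ is packed into a 7-bit integer with $z_1$ as the most significant bit, $x = (y_1, y_2, z)$ is packed into a 9-bit integer in the same manner, and all helper names are hypothetical.

```python
GAMMAS = {(0, 0): 0b1111111, (0, 1): 0b1111110,
          (1, 0): 0b1111101, (1, 1): 0b1111011}


def f(y1, y2, z):
    """f(y, z) = <P(y), z>, with P given by the table of gamma vectors."""
    return bin(GAMMAS[(y1, y2)] & z).count("1") & 1


def f_anf(y1, y2, z):
    """The algebraic form quoted in the text."""
    zs = [(z >> (6 - k)) & 1 for k in range(7)]   # zs[0] = z1, ..., zs[6] = z7
    z5, z6, z7 = zs[4], zs[5], zs[6]
    return (y1 & z6) ^ (y2 & z7) ^ (y1 & y2 & (z5 ^ z6 ^ z7)) ^ (sum(zs) & 1)


def walsh_spectrum(truth_table):
    """Return [<xi, l_i>] for all rows l_i of H_n (fast Walsh-Hadamard transform)."""
    s = [(-1) ** bit for bit in truth_table]
    h = 1
    while h < len(s):
        for start in range(0, len(s), 2 * h):
            for k in range(start, start + h):
                s[k], s[k + h] = s[k] + s[k + h], s[k] - s[k + h]
        h *= 2
    return s


inputs = [((x >> 8) & 1, (x >> 7) & 1, x & 0x7F) for x in range(512)]
truth_table = [f(*args) for args in inputs]
spectrum = walsh_spectrum(truth_table)

nonlinearity = 2 ** 8 - max(abs(v) for v in spectrum) // 2
# Smallest Hamming weight of a non-zero index with a non-zero spectrum value.
min_weight = min(bin(i).count("1") for i in range(1, 512) if spectrum[i] != 0)
immunity_order = min_weight - 1
balanced = sum(truth_table) == 256
forms_agree = all(f(*args) == f_anf(*args) for args in inputs)
```

Because $P$ is injective and every $\gamma_j$ has weight at least 6, the only non-zero spectrum values are $\pm 128$ and they occur at indices of weight at least 6, which is what the computed quantities reflect.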

The above discussions indicate that the upper bound ($2^{n-1} - 2^{m+1}$) is indeed
tight for balanced correlation immune functions. While we have not been able
to identify whether the bound is also tight for unbalanced correlation immune
functions, its implication would be marginal, due to the fact that unbalanced
correlation immune functions have found little use in practice.

To close this section, let us note that in [13], an unbalanced 3rd-order
correlation immune function on $V_6$ whose nonlinearity achieves
$2^5 - 2^3 = 24$ is constructed. This particular function does not contradict
Theorem 5 or Theorem 6, as the specific parameters $n = 6$ and $m = 3$ satisfy
neither $\binom{n}{m+1} > 2^{2n-2m-2}$ nor $m \ge 0.6n - 0.4$.

8 Concluding Remarks 

Three separate research groups, Sarkar and Maitra, Tarannikov [13], and Zheng
and Zhang, have apparently considered the same question on the upper bound
on nonlinearity of correlation immune functions, independently of one another.
All three groups submitted their research results to CRYPTO 2000, although
only Sarkar and Maitra's was accepted. Our current paper contains essentially
the same research results included in our CRYPTO 2000 submission, minus those
that happened to overlap results in Sarkar and Maitra's CRYPTO 2000 paper.

Theorem 6 leaves open the question of whether the condition
$0.6n - 0.4 \le m \le n - 2$ can be relaxed to $\frac{1}{2}n - 1 \le m \le n - 2$,
where $n > 6$. We have recently solved this problem [17].

It would also be interesting, albeit purely from a theoretical point of view,
to examine whether the bound $N_f = 2^{n-1} - 2^{m+1}$, where $m \ge 0.6n - 0.4$, is
also tight for unbalanced $m$th-order correlation immune functions, and if it is,
how to construct such functions.

Abstract. We present a new non-interactive public key distribution system
based on the class group of a non-maximal imaginary quadratic order
$Cl(\Delta_p)$. The main advantage of our system over earlier proposals based
on $(\mathbb{Z}/n\mathbb{Z})^*$ [19,21] is that embedding id information into group
elements in a cyclic subgroup of the class group is easy (straightforward
embedding into prime ideals suffices) and secure, since the entire class group is
cyclic with very high probability.

In order to compute discrete logarithms in the class group, the KGC
needs to know the prime factorization of $\Delta_p = \Delta_1 p^2$. We present an
algorithm for computing discrete logarithms in $Cl(\Delta_p)$ by reducing the
problem to computing discrete logarithms in $Cl(\Delta_1)$ and either
$\mathbb{F}_p^*$ or $\mathbb{F}_{p^2}^*$. We prove that a similar reduction works for
arbitrary non-maximal orders, and that it has polynomial complexity if the
factorization of the conductor is known.

Keywords: discrete logarithm, non-maximal imaginary quadratic order, 
non-interactive cryptography, identity based cryptosystem 



1 Introduction 

Public-key cryptography is undoubtedly one of the core techniques used to enable
authentic, non-repudiable and confidential communication. However, a general
problem inherent in public-key systems is that one needs to ensure the
authenticity of a given public key. The most common way to solve this problem is to
introduce a trusted third party, called a Certification Authority (CA), which
issues certificates for public keys¹. While this approach is widely used in
practice, it would be desirable to have an immediate binding between an identity
$ID_B$ and its corresponding public key $\mathfrak{b}$, which allows one to avoid the
tedious verification of certificates. This leads to the notion of identity based
cryptosystems.

Although the paradigm of identity based cryptography was already introduced
by Shamir in 1984 [23], it seems that Maurer and Yacobi [19] were the
first to propose a non-interactive identity based public key cryptosystem in which
Bob's public key $\mathfrak{b}$ can be derived efficiently, solely from his public
identity information $ID_B$, by computing a publicly-known embedding function
$\mathfrak{b} = f(ID_B)$. The main idea is to use an (ideally cyclic) group $G$
(generated by $g$) in which exponentiation is not only a one-way function but a
trapdoor one-way function. The key generation center (KGC), a trusted third party
responsible for distributing the private keys, knows the trapdoor information and
hence is able to compute discrete logarithms in $G$. Thus, the KGC computes Bob's
private key $b$ such that $g^b = \mathfrak{b} = f(ID_B)$. The KGC hands over the
secret key $b$ to Bob, who can use this key in a conventional ElGamal or
Diffie-Hellman setup. As soon as all users are equipped with their corresponding
secret key, the KGC can destroy the trapdoor information and may cease to exist.
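
The key flow described above can be mimicked in a toy setting. The sketch below is not the scheme of this paper: it swaps the class group for $(\mathbb{Z}/p\mathbb{Z})^*$ with the tiny prime $p = 65537$ and generator 3, where discrete logarithms are trivial for everyone, so there is no real trapdoor. The embedding via SHA-256 and all function names are hypothetical choices made only to show how $g^b = f(ID_B)$ leads to a shared key without interaction.

```python
import hashlib

P = 65537   # toy prime; (Z/PZ)^* is cyclic of order 65536
G = 3       # 3 generates (Z/PZ)^* for this particular prime


def embed(identity: str) -> int:
    """Hypothetical embedding f(ID): hash the identity into {1, ..., P-1}."""
    digest = hashlib.sha256(identity.encode()).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def discrete_log(target: int) -> int:
    """Baby-step giant-step: find x with G^x = target (mod P)."""
    m = 256                                   # m * m >= P - 1
    baby = {pow(G, j, P): j for j in range(m)}
    giant = pow(G, -m, P)                     # modular inverse power (Python 3.8+)
    gamma = target
    for i in range(m):
        if gamma in baby:
            return i * m + baby[gamma]
        gamma = gamma * giant % P
    raise ValueError("no logarithm found")


def kgc_private_key(identity: str) -> int:
    """Role of the KGC: solve G^b = f(ID) for the user's private key b."""
    return discrete_log(embed(identity))


def shared_key(own_private: int, peer_identity: str) -> int:
    """Non-interactive key agreement: f(ID_peer)^(own private key) mod P."""
    return pow(embed(peer_identity), own_private, P)


a = kgc_private_key("alice@example.org")
b = kgc_private_key("bob@example.org")
```

Alice obtains `shared_key(a, "bob@example.org")` and Bob obtains `shared_key(b, "alice@example.org")`; both equal $g^{ab}$, and neither party needs the other's certificate or a message exchange.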

Maurer and Yacobi's initial proposal was to set up a discrete logarithm based
system in $G = (\mathbb{Z}/n\mathbb{Z})^*$, where $n = p_1 \cdots p_r$, $p_i$ prime,
such that only the KGC, which knows the factorization of $n$, is able to compute
discrete logarithms in $G$. However, this approach has a number of drawbacks which
render such a scheme impractical [20,18,17].

In this paper, we show that using the class group $Cl(\Delta_p)$ of a non-maximal
imaginary quadratic order is much better suited for this purpose. As in the
original scheme, the KGC knows trapdoor information (the prime factorization of
$\Delta_p$) which enables it to compute discrete logarithms, while for anybody else
the discrete logarithm problem (DLP) is assumed to be intractable. We generalize
the recent result from [12], valid for the very special case of totally non-maximal
orders with prime discriminant, to arbitrary non-maximal imaginary quadratic
orders. The resulting algorithm reduces the problem of discrete logarithm
computation in the class group of a non-maximal order to computing discrete
logarithms in the much smaller class group of the corresponding maximal order and
a small number of finite fields. Only the KGC, which knows the factorization of
$\Delta_p$, can perform this reduction.

As noted above there are a few advantages to our approach. Unlike the case of
$(\mathbb{Z}/n\mathbb{Z})^*$, it is heuristically easy to find class groups
$Cl(\Delta_p)$ which are cyclic, and hence the embedding of an identity $ID_B$ into a
group element $\mathfrak{b}$, for which the discrete logarithm exists, is
straightforward. As the results from [20,18] demonstrate, it seems to be no trivial
task to find an embedding into a subgroup of $(\mathbb{Z}/n\mathbb{Z})^*$ which does
not facilitate factoring $n$. In fact, the only secure embedding method for
$(\mathbb{Z}/n\mathbb{Z})^*$ seems to restrict $n$ to having only two large prime
factors $p_1$ and $p_2$, and the workload for the KGC is consequently very high.
Furthermore, since one chooses $p_i - 1$ smooth and uses Pohlig-Hellman's
simplification together with Shanks' Baby-Step Giant-Step algorithm, the time
needed for generating $k$ user keys is proportional to $k$.

¹ We assume throughout this work that Alice (A) wants to encrypt a message
$m \in \mathbb{Z}_{>0}$ intended for Bob (B). We denote Bob's unique identity, for
example his email address, by $ID_B$ and his public key by $\mathfrak{b}$.

In contrast, we use two different subexponential algorithms for the key
generation. After the initial computation of relations over the factor bases, the
workload for each individual key generation is very modest. For the computation
of discrete logarithms in the class group of the maximal order, $Cl(\Delta_1)$, we
use an analogue of the Self-Initializing Quadratic Sieve (SIQS) factoring algorithm
[14,13], and for the computation of discrete logarithms in $\mathbb{F}_p^*$ we use
the Special Number Field Sieve, which recently was used for the solution of
McCurley's challenge [24].

This paper is organized as follows: in Section 2 we provide the necessary back- 
ground and notation for non-maximal imaginary quadratic orders. The next sec- 
tion contains the discrete logarithm algorithm for arbitrary non-maximal imagi- 
nary quadratic orders, and in Section 4 we present our new non-interactive public 
key cryptosystem. In order to save space, the proofs of most results have been 
omitted. These proofs, as well as computational results, will be given in the full 
paper [10]. 

2 Non-maximal Imaginary Quadratic Orders 

The basic notions of imaginary quadratic number fields can be found in [1,2]. 
For a more comprehensive treatment of the relationship between maximal and 
non-maximal orders we refer to [5,9,12]. 

Let $\mathcal{O}_{\Delta_f}$ denote the non-maximal quadratic order of discriminant
$\Delta_f = \Delta_1 f^2$ with conductor $f$, and let $\mathcal{O}_{\Delta_1}$ denote
the corresponding maximal order. When the conductor is prime, we will use
$\mathcal{O}_{\Delta_p}$ and $\Delta_p$. By $Cl(\Delta_f)$ and $Cl(\Delta_1)$ we
denote the ideal class groups of $\mathcal{O}_{\Delta_f}$ and
$\mathcal{O}_{\Delta_1}$ respectively. The class numbers $h(\Delta_f)$ and
$h(\Delta_1)$ are the orders of these groups. Lower-case Gothic letters
$\mathfrak{a}, \mathfrak{b}, \ldots$ denote ideals in $\mathcal{O}_{\Delta_f}$ and
upper-case Gothic letters denote ideals in $\mathcal{O}_{\Delta_1}$. Ideal
equivalence is denoted by $\mathfrak{a} \sim \mathfrak{b}$, and the class of all
ideals equivalent to $\mathfrak{a}$ is denoted by $[\mathfrak{a}]$. Throughout, we
will use $\Delta$ without subscript to denote the discriminant of an arbitrary
quadratic order, maximal or non-maximal.

Our cryptosystem makes use of the relationship between a non-maximal order
of conductor $f$ and its corresponding maximal order. Any non-maximal order can
be represented as $\mathcal{O}_{\Delta_f} = \mathbb{Z} + f \mathcal{O}_{\Delta_1}$. If
$h(\Delta_1) = 1$, $\mathcal{O}_{\Delta_f}$ is called a totally non-maximal order. An
integral ideal $\mathfrak{a}$ is called prime to $f$ if
$\gcd(N(\mathfrak{a}), f) = 1$. It is well-known that all
$\mathcal{O}_{\Delta_f}$-ideals prime to the conductor are invertible, and in every
ideal equivalence class there is an ideal which is prime to any given number. We
denote the principal $\mathcal{O}_{\Delta_f}$-ideals prime to $f$ by
$\mathcal{P}_{\Delta_f}(f)$ and all fractional ideals which are prime to $f$ by
$\mathcal{I}_{\Delta_f}(f)$. There is an isomorphism

$$\mathcal{I}_{\Delta_f}(f) / \mathcal{P}_{\Delta_f}(f) \simeq Cl(\Delta_f) \qquad (1)$$

so we can "ignore" the ideals which are not prime to the conductor if we are
only interested in the class group $Cl(\Delta_f)$.

There is an isomorphism between the group of $\mathcal{O}_{\Delta_f}$-ideals which
are prime to $f$ and the group of $\mathcal{O}_{\Delta_1}$-ideals which are prime to
$f$, denoted by $\mathcal{I}_{\Delta_f}(f)$ and $\mathcal{I}_{\Delta_1}(f)$,
respectively.

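Proposition 1. Let $\mathcal{O}_{\Delta_f}$ be an order of conductor $f$ in an
imaginary quadratic field $\mathbb{Q}(\sqrt{\Delta_1})$ with maximal order
$\mathcal{O}_{\Delta_1}$.

(i.) If $\mathfrak{A} \in \mathcal{I}_{\Delta_1}(f)$, then
$\mathfrak{a} = \mathfrak{A} \cap \mathcal{O}_{\Delta_f} \in \mathcal{I}_{\Delta_f}(f)$
and $N(\mathfrak{A}) = N(\mathfrak{a})$.

(ii.) If $\mathfrak{a} \in \mathcal{I}_{\Delta_f}(f)$, then
$\mathfrak{A} = \mathfrak{a} \mathcal{O}_{\Delta_1} \in \mathcal{I}_{\Delta_1}(f)$ and
$N(\mathfrak{a}) = N(\mathfrak{A})$.

(iii.) The map $\varphi : \mathfrak{A} \mapsto \mathfrak{A} \cap \mathcal{O}_{\Delta_f}$
induces an isomorphism $\mathcal{I}_{\Delta_1}(f) \to \mathcal{I}_{\Delta_f}(f)$.
The inverse of this map is
$\varphi^{-1} : \mathfrak{a} \mapsto \mathfrak{a} \mathcal{O}_{\Delta_1}$.

Thus we are able to switch to and from ideals in the maximal and non-maximal
orders via the map $\varphi$. The algorithms GoToMaxOrder($\mathfrak{a}$, $f$) to
compute $\varphi^{-1}$ and GoToNonMaxOrder($\mathfrak{A}$, $f$) to compute $\varphi$
can be found in [9]. If
$\mathfrak{a} = a\mathbb{Z} + \frac{b + \sqrt{\Delta_f}}{2}\mathbb{Z} = (a, b)$ and
$\mathfrak{A} = A\mathbb{Z} + \frac{B + \sqrt{\Delta_1}}{2}\mathbb{Z} = (A, B)$ are
reduced ideals, then these algorithms need $O(\log(|\Delta_1|)^2)$ and
$O(\log(|\Delta_f|)^2)$ bit-operations respectively.

It is important to note that the isomorphism $\varphi$ is between the ideal groups
$\mathcal{I}_{\Delta_1}(f)$ and $\mathcal{I}_{\Delta_f}(f)$ and not the class groups.
If, for $\mathfrak{A}, \mathfrak{B} \in \mathcal{I}_{\Delta_1}(f)$ we have
$\mathfrak{A} \sim \mathfrak{B}$, it is not necessarily true that
$\varphi(\mathfrak{A}) \sim \varphi(\mathfrak{B})$. On the other hand, equivalence
does hold under $\varphi^{-1}$. More precisely we have the following: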

Proposition 2. The isomorphism $\varphi^{-1}$ induces a surjective homomorphism
$\phi_{Cl}^{-1} : Cl(\Delta_f) \to Cl(\Delta_1)$, where
$[\mathfrak{a}] \mapsto [\varphi^{-1}(\mathfrak{a})]$.

We now focus on the kernel $\mathrm{Ker}(\phi_{Cl}^{-1})$ of this map, which will
turn out to be of central importance for the computation of discrete logarithms in
$Cl(\Delta_f)$. In particular, we will need to compute discrete logarithms of
elements in $\mathrm{Ker}(\phi_{Cl}^{-1})$. Representing elements of
$\mathrm{Ker}(\phi_{Cl}^{-1})$ as ideal equivalence classes is completely
inadequate for this purpose since we would have to compute discrete logarithms
in $Cl(\Delta_f)$. Fortunately, there exists an alternative representation which
allows us to reduce the problem of computing discrete logarithms in
$\mathrm{Ker}(\phi_{Cl}^{-1})$ to that in a small number of finite fields.

Proposition 3. The map
$\psi : (\mathcal{O}_{\Delta_1} / f\mathcal{O}_{\Delta_1})^* \to \mathrm{Ker}(\phi_{Cl}^{-1})$,
$[\alpha] \mapsto [\varphi(\alpha \mathcal{O}_{\Delta_1})]$, is a surjective
homomorphism.

This homomorphism suggests the following representation for ideal classes in 
the kernel: 

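Definition 1. Let $[\alpha] = [x + y\omega] \in (\mathcal{O}_{\Delta_1}/f\mathcal{O}_{\Delta_1})^*$ and let $\mathfrak{a} \sim \varphi(\alpha\mathcal{O}_{\Delta_1})$ be a reduced $\mathcal{O}_{\Delta_f}$-ideal whose equivalence class lies in $\mathrm{Ker}(\varphi_{Cl}^{-1})$. Then the pair $(x, y)$ is called a generator representation for the equivalence class $[\mathfrak{a}]$.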

Remark 1. Note that this generator representation $(x, y)$ for the class of $\mathfrak{a}$ is not unique. It is easy to see that $(kx, ky)$, $k \in (\mathbb{Z}/f\mathbb{Z})^*$, is also a generator representation for the class of $\mathfrak{a}$. This means that we have $\mathfrak{a} \sim \varphi((x + y\omega)\mathcal{O}_{\Delta_1}) \sim \varphi((kx + ky\omega)\mathcal{O}_{\Delta_1})$. In other words, $\mathrm{Ker}(\varphi_{Cl}^{-1}) \cong (\mathcal{O}_{\Delta_1}/f\mathcal{O}_{\Delta_1})^* / \iota((\mathbb{Z}/f\mathbb{Z})^*)$, where $\iota$ denotes the natural embedding of $\mathbb{Z}/f\mathbb{Z}$ into $(\mathcal{O}_{\Delta_1}/f\mathcal{O}_{\Delta_1})^*$, as illustrated by the exact sequence (7.27) in [5, p.147].

Our reduction of the discrete logarithm problem in $Cl(\Delta_f)$ to $Cl(\Delta_1)$ and finite fields requires computing various preimages of elements in $\mathrm{Ker}(\varphi_{Cl}^{-1})$ under the map $\psi$. Algorithm 1 (Std2Gen) accomplishes this task. The algorithm Reduce reduces an ideal $\mathfrak{A}$ given in standard representation and simultaneously computes a reducing number $\gamma \in \mathcal{O}_{\Delta_1}$ of the form $(x + y\sqrt{\Delta_1})/2$ such that $\mathfrak{A}/\gamma$ is reduced (see, for example, [14, Algorithm 2.6, p.16]).



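Algorithm 1 Std2Gen

Input: The standard representation $(a, b)$ of a reduced $\mathcal{O}_{\Delta_f}$-ideal $\mathfrak{a} = a\mathbb{Z} + \frac{b + \sqrt{\Delta_f}}{2}\mathbb{Z}$, representing a class in $\mathrm{Ker}(\varphi_{Cl}^{-1})$, and the conductor $f$.

Output: A generator representation $(x, y)$ of the class $[\mathfrak{a}] \in \mathrm{Ker}(\varphi_{Cl}^{-1})$.

  $(A, B) \leftarrow$ GoToMaxOrder($\mathfrak{a}, f$)
  $(\mathfrak{G}, \gamma) \leftarrow$ Reduce($A, B$)    {$\gamma = (x + y\sqrt{\Delta_1})/2$}
  if $\mathfrak{G} \neq \mathcal{O}_{\Delta_1}$ then
    return('Error! $\mathfrak{a} \notin \mathrm{Ker}(\varphi_{Cl}^{-1})$!')
  end if
  if $\Delta_1 \equiv 0 \pmod 4$ then
    $x \leftarrow x/2 \pmod f$
    $y \leftarrow y/2 \pmod f$
  else
    $x \leftarrow (x - y)/2 \pmod f$
    $y \leftarrow y \pmod f$
  end if
  return($(x, y)$)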



3 The DLP for Arbitrary $Cl(\Delta_f)$

In this section we generalize the result from [12]. We show that given the conductor $f$ and its prime factorization one can reduce the DLP in an arbitrary $Cl(\Delta_f)$ to the DLP in various smaller groups. More precisely, we first show that the computation of discrete logarithms in $Cl(\Delta_f)$ can be reduced to the computation of discrete logarithms in the class group $Cl(\Delta_1)$ of the maximal order and the computation of discrete logarithms in $\mathrm{Ker}(\varphi_{Cl}^{-1})$. Furthermore, we show that the latter problem boils down to the computation of discrete logarithms in a small number of finite fields.

It should be noted that our method here is in essence a special case of the more general methods employed by Cohen et al. to compute discrete logarithms in ray class groups [3]. The class group of a non-maximal order in any number field, not only degree 2, can be viewed as a ray class group of the maximal order, where the modulus is simply an integer, the conductor of the non-maximal order. Our exposition here is a reformulation of these results in terms of the simpler, special case of non-maximal orders using the language of [12]. In addition, we prove that the reduction of the DLP in $Cl(\Delta_f)$ to discrete logarithm computations in $Cl(\Delta_1)$ and a small number of finite fields is of polynomial complexity.

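We start with an algorithm which reduces the DLP in $Cl(\Delta_f)$ to the DLP in $Cl(\Delta_1)$ and $\mathrm{Ker}(\varphi_{Cl}^{-1})$. Since the map $\psi : (\mathcal{O}_{\Delta_1}/f\mathcal{O}_{\Delta_1})^* \to \mathrm{Ker}(\varphi_{Cl}^{-1})$ given in Proposition 3 induces the isomorphism $\mathrm{Ker}(\varphi_{Cl}^{-1}) \cong (\mathcal{O}_{\Delta_1}/f\mathcal{O}_{\Delta_1})^*/\iota((\mathbb{Z}/f\mathbb{Z})^*)$, we will reduce the latter DLP to computations in $(\mathcal{O}_{\Delta_1}/f\mathcal{O}_{\Delta_1})^*$. Thus, our algorithm makes use of the following two methods: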

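- DLPinCl($\mathfrak{G}, \mathfrak{A}$)

Accepts two reduced $\mathcal{O}_{\Delta_1}$-ideals $\mathfrak{G}, \mathfrak{A}$ as input and returns $x \in \mathbb{Z}$ with $0 \le x < h(\Delta_1)$ such that $\mathfrak{G}^x \sim \mathfrak{A}$, or $x = -1$ if no such $x$ exists.

- DLPinKerphi($\gamma, \alpha, |\mathrm{Ker}(\varphi_{Cl}^{-1})|$)

Accepts two generator representations $\gamma, \alpha$ of classes in $\mathrm{Ker}(\varphi_{Cl}^{-1})$ such that $[\gamma], [\alpha] \in (\mathcal{O}_{\Delta_1}/f\mathcal{O}_{\Delta_1})^*$ as input and returns $x \in \mathbb{Z}$ with $0 \le x < |\mathrm{Ker}(\varphi_{Cl}^{-1})|$ such that $\psi([\gamma])^x = \psi([\alpha])$ in $\mathrm{Ker}(\varphi_{Cl}^{-1})$, or $x = -1$ if no such $x$ exists.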

Furthermore, we assume that $h(\Delta_1)$ is known. This is no practical restriction, since the best currently known algorithm [14,13] for computing discrete logarithms in $Cl(\Delta_1)$ needs to compute $h(\Delta_1)$ and the group structure of $Cl(\Delta_1)$ before the actual DL-computation starts. Secondly, if there were any other algorithm DLPinCl with the above properties, then one could use it to compute $h(\Delta_1)$, as shown in the full paper [10].

Algorithm 2 (ReduceDLP) reduces the DLP in $Cl(\Delta_f)$ to the DLP in $Cl(\Delta_1)$ and $\mathrm{Ker}(\varphi_{Cl}^{-1}) \cong (\mathcal{O}_{\Delta_1}/f\mathcal{O}_{\Delta_1})^*/\iota((\mathbb{Z}/f\mathbb{Z})^*)$. The proof of correctness can be found in the full version of the paper [10].

Proposition 4. Given the conductor /, the class number h{Ai) and the order 
of the kernel ]Ker(^^;^)j one can reduce the DLP in Cl{Af) in 0(log(jZ\jj)^) 
bit- operations to the DLP in Cl{Ai) and Ker{cj)Qj). 

Thus, in order to compute discrete logarithms in $Cl(\Delta_f)$, we need efficient algorithms for computing discrete logarithms in $Cl(\Delta_1)$ and $\mathrm{Ker}(\varphi_{Cl}^{-1})$. The subexponential algorithm described in [13, Algorithm 3.3] is the most efficient algorithm known for computing discrete logarithms in $Cl(\Delta_1)$. We now consider the DLP in $\mathrm{Ker}(\varphi_{Cl}^{-1}) \cong (\mathcal{O}_{\Delta_1}/f\mathcal{O}_{\Delta_1})^*/\iota((\mathbb{Z}/f\mathbb{Z})^*)$ more closely.

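By the Chinese Remainder Theorem (see, for example, [15, p.11]), the DLP in $(\mathcal{O}_{\Delta_1}/f\mathcal{O}_{\Delta_1})^*/\iota((\mathbb{Z}/f\mathbb{Z})^*)$ boils down to DLPs in $(\mathcal{O}_{\Delta_1}/p_i^{e_i}\mathcal{O}_{\Delta_1})^*/\iota((\mathbb{Z}/p_i^{e_i}\mathbb{Z})^*)$ for prime powers $p_i^{e_i}$, where $f = \prod p_i^{e_i}$. Furthermore, this problem can be efficiently reduced to the prime case $(\mathcal{O}_{\Delta_1}/p_i\mathcal{O}_{\Delta_1})^*/\iota(\mathbb{F}_{p_i}^*)$. We give an algorithm (ReducePe2P) for this reduction in the full version of the paper [10].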
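
Algorithm 2 ReduceDLP

Input: Two reduced $\mathcal{O}_{\Delta_f}$-ideals $\mathfrak{g}, \mathfrak{a}$, the conductor $f$, the class number $h(\Delta_1)$, and the order of the kernel $|\mathrm{Ker}(\varphi_{Cl}^{-1})| = f \prod_{p \mid f} \left(1 - \left(\frac{\Delta_1}{p}\right)\frac{1}{p}\right)$.

Output: The discrete logarithm $x$, such that $\mathfrak{g}^x \sim \mathfrak{a}$, with $0 \le x < h(\Delta_f)$, or $x = -1$, if no such $x$ exists.

  {Compute DL in $Cl(\Delta_1)$}
  $\mathfrak{G} \leftarrow$ GoToMaxOrder($\mathfrak{g}, f$)
  $\mathfrak{A} \leftarrow$ GoToMaxOrder($\mathfrak{a}, f$)
  $x_1 \leftarrow$ DLPinCl($\mathfrak{G}, \mathfrak{A}$)
  if $x_1 = -1$ then
    return($-1$)
  end if

  {Compute DL in $(\mathcal{O}_{\Delta_1}/f\mathcal{O}_{\Delta_1})^*$}
  $\alpha \leftarrow$ Std2Gen($\mathfrak{a}/\mathfrak{g}^{x_1}, f$)
  $\gamma \leftarrow$ Std2Gen($\mathfrak{g}^{h(\Delta_1)}, f$)
  $c \leftarrow$ DLPinKerphi($\gamma, \alpha, |\mathrm{Ker}(\varphi_{Cl}^{-1})|$)
  if $c = -1$ then
    return($-1$)
  end if

  {Combine partial results to get DL in $Cl(\Delta_f)$}
  $x \leftarrow c \cdot h(\Delta_1) + x_1$
  return($x$)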
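
The control flow of ReduceDLP can be tried out in a toy setting. The following Python sketch is only an analogy under stated assumptions: the class group $Cl(\Delta_f)$ is replaced by the cyclic group $\mathbb{F}_{67}^*$ of order $66 = 6 \cdot 11$, the surjection onto $Cl(\Delta_1)$ is replaced by the hypothetical map $x \mapsto x^{11}$ (image of order $h = 6$, kernel of order 11), and both sub-DLPs are solved by brute force. The function names `brute_dlp` and `reduce_dlp_toy` are illustrative and do not come from the paper.

```python
def brute_dlp(base, target, order, mod):
    """Smallest x in [0, order) with base**x == target (mod mod), else -1."""
    acc = 1
    for x in range(order):
        if acc == target % mod:
            return x
        acc = acc * base % mod
    return -1


def reduce_dlp_toy(g, a, q, h, ker_order):
    """Toy analogue of ReduceDLP in F_q^* with |F_q^*| = h * ker_order.

    phi(x) = x**ker_order stands in for the surjection onto the smaller
    group of order h; its kernel has order ker_order.
    """
    def phi(x):
        return pow(x, ker_order, q)

    # Step 1: discrete logarithm in the image group (role of Cl(Delta_1)).
    x1 = brute_dlp(phi(g), phi(a), h, q)
    if x1 == -1:
        return -1

    # Step 2: discrete logarithm in the kernel, base g^h, target a / g^x1.
    g_x1_inv = pow(pow(g, x1, q), q - 2, q)   # inverse via Fermat's little theorem
    alpha = a * g_x1_inv % q
    gamma = pow(g, h, q)
    c = brute_dlp(gamma, alpha, ker_order, q)
    if c == -1:
        return -1

    # Step 3: combine the partial results exactly as in the last line of Algorithm 2.
    return c * h + x1


if __name__ == "__main__":
    q, h, ker_order, g = 67, 6, 11, 2      # 2 generates F_67^*
    a = pow(g, 41, q)
    print(reduce_dlp_toy(g, a, q, h, ker_order))
```

The point of the sketch is the last step: once $x_1$ is known modulo $h$, the quotient $a / g^{x_1}$ lies in the kernel, which is generated by $g^{h}$, so the remaining unknown $c$ lives in a group of order `ker_order` only.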



Proposition 5. The DTP in (0/ij/p®(!l^J*/z((Z/p®Z)*) can be reduced in 0{e- 
(logp®)^) bit- operations to 2e DL-computations in {O / pO a-i)* / ■ 

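Corollary 1. If $e = O((\log p)^a)$ for some $a = O(1)$, then the DLP in $(\mathcal{O}_{\Delta_1}/p^e\mathcal{O}_{\Delta_1})^*/\iota((\mathbb{Z}/p^e\mathbb{Z})^*)$ can be reduced in polynomial time (in $\log p$) to the DLP in $(\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^*/\iota(\mathbb{F}_p^*)$.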

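Using ReduceDLP and ReducePe2P allows us to reduce the DLP in $Cl(\Delta_f)$ to DLPs in $Cl(\Delta_1)$ and $(\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^*/\iota(\mathbb{F}_p^*)$. As shown in [12,11], $(\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^*$ is isomorphic to either $\mathbb{F}_p^* \times \mathbb{F}_p^*$ or $\mathbb{F}_{p^2}^*$, depending on how $p$ splits in $\mathcal{O}_{\Delta_1}$. This immediately leads to the central result of this section.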

Theorem 1. If the prime factorization of the conductor $f = \prod_{i=1}^{k} p_i^{e_i}$ is known and $e_i = O((\log p_i)^a)$ for some $a = O(1)$ then one can reduce the discrete logarithm problem in $Cl(\Delta_f)$ in polynomial time (in $\log|\Delta_f|$) to the computation of logarithms in $Cl(\Delta_1)$ and the following groups ($1 \le i \le k$):







Proof. If the conductor $f$ and its prime factorization are known, then one can use ReduceDLP (Algorithm 2) to reduce the DLP in $Cl(\Delta_f)$ to the DLP in $Cl(\Delta_1)$ and $\mathrm{Ker}(\varphi_{Cl}^{-1})$. By Proposition 4 this is possible in polynomial time in $\log|\Delta_f|$. By the Chinese Remainder Theorem (using the known factorization of $f$) the DLP in $\mathrm{Ker}(\varphi_{Cl}^{-1}) \cong (\mathcal{O}_{\Delta_1}/f\mathcal{O}_{\Delta_1})^*/\iota((\mathbb{Z}/f\mathbb{Z})^*)$ is nothing more than the DLP in groups of the form $(\mathcal{O}_{\Delta_1}/p_i^{e_i}\mathcal{O}_{\Delta_1})^*/\iota((\mathbb{Z}/p_i^{e_i}\mathbb{Z})^*)$, which can, using ReducePe2P (from [10]) and Corollary 1, be reduced in polynomial time (in $\log p_i$) to the DLP in $(\mathcal{O}_{\Delta_1}/p_i\mathcal{O}_{\Delta_1})^*/\iota(\mathbb{F}_{p_i}^*)$, because $e_i$ is assumed to be polynomial in $\log p_i$.

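It remains to show how one reduces the discrete logarithm problem in $(\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^*/\iota(\mathbb{F}_p^*)$ to discrete logarithm problems in $\mathbb{F}_p^*$ or $\mathbb{F}_{p^2}^*$. Suppose we have two representatives $\gamma, \alpha$ of classes in $(\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^*$ for which we want to compute the discrete logarithm $c$ such that $[\gamma]^c = [\alpha]$ in $(\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^*/\iota(\mathbb{F}_p^*)$. In the inert case $(\Delta_1/p) = -1$, where $(\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^* \cong \mathbb{F}_{p^2}^*$, we have $(\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^*/\iota(\mathbb{F}_p^*) \cong \mathbb{F}_{p^2}^*/\iota(\mathbb{F}_p^*)$. It is well-known that there always exists a surjective homomorphism from $\mathbb{F}_{p^2}^*$ to $\mathbb{F}_{p^2}^*/\iota(\mathbb{F}_p^*)$. Thus, we first solve the DLP $\gamma^{c'} \equiv \alpha \pmod{p\mathcal{O}_{\Delta_1}}$ by simply solving the corresponding DLP in $\mathbb{F}_{p^2}^*$. Since the quotient has order $(p^2 - 1)/(p - 1) = p + 1$, taking $c = c' \bmod (p + 1)$ yields the required solution to the DLP $[\gamma]^c = [\alpha]$ in $(\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^*/\iota(\mathbb{F}_p^*)$.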

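We now restrict our attention to the split case $(\Delta_1/p) = 1$, where we have $(\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^* \cong \mathbb{F}_p^* \times \mathbb{F}_p^*$. The element $\gamma = (x_1, y_1)$ maps to $(x_1 \bmod p, y_1 \bmod p) \in \mathbb{F}_p^* \times \mathbb{F}_p^*$ and similarly $\alpha = (x_2, y_2)$ maps to $(x_2 \bmod p, y_2 \bmod p)$. The DLP in $(\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^*/\iota(\mathbb{F}_p^*)$ becomes

$(x_1, y_1)^c = l\,(x_2, y_2)$ (in $\mathbb{F}_p^* \times \mathbb{F}_p^*$)

which in turn yields the simultaneous DLPs

$x_1^c \equiv l\,x_2 \pmod p, \qquad y_1^c \equiv l\,y_2 \pmod p$.

Since these two DLPs must be solved for the same $c$ and $l$, we can combine them and obtain the single DLP in $\mathbb{F}_p^*$

$\left(\frac{x_1}{y_1}\right)^c \equiv \frac{x_2}{y_2} \pmod p$

from which we can find the desired value of $c$.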
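
The combination step in the split case can be sketched in a few lines of Python. This is a minimal illustration, assuming a prime $p$ small enough for exhaustive search; the name `dlp_split_case` is hypothetical, and a real implementation would replace the loop by an index-calculus or similar algorithm in $\mathbb{F}_p^*$.

```python
def dlp_split_case(gamma, alpha, p):
    """Solve (x1, y1)^c = l * (x2, y2) in F_p^* x F_p^* for c (l is eliminated).

    gamma = (x1, y1) and alpha = (x2, y2) are pairs of residues mod the
    odd prime p. Returns the smallest c >= 0, or -1 if none exists.
    """
    (x1, y1), (x2, y2) = gamma, alpha

    def inv(v):
        return pow(v, p - 2, p)          # inverse in F_p^* (p prime)

    base = x1 * inv(y1) % p              # x1 / y1
    target = x2 * inv(y2) % p            # x2 / y2
    acc = 1
    for c in range(p - 1):               # brute force: toy sizes only
        if acc == target:
            return c
        acc = acc * base % p
    return -1


if __name__ == "__main__":
    # Values taken from the worked example in Section 3.1 (p = 23).
    print(dlp_split_case((7, 3), (6, 2), 23))
```

Dividing the two congruences removes the unknown scalar $l$, which is why a single logarithm in $\mathbb{F}_p^*$ suffices.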

As noted in [8], this simple strategy can be used to improve the general maps from [12,11]; it is shown that in this case there not only exists a surjective homomorphism $\mathbb{F}_p^* \times \mathbb{F}_p^* \to \mathrm{Ker}(\varphi_{Cl}^{-1})$, but even an efficiently computable isomorphism $\mathbb{F}_p^* \cong \mathrm{Ker}(\varphi_{Cl}^{-1})$. □

Note that the central result of [12] is now nothing more than an immediate corollary.



3.1 Example 

We illustrate the reduction of discrete logarithm computations in $Cl(\Delta_f)$ via a small example. Suppose $\Delta_1 = -1019$, $f = 23$, and $\Delta_f = \Delta_1 f^2 = -539051$. In this case, both $Cl(\Delta_f)$ and $Cl(\Delta_1)$ are cyclic with $h(\Delta_1) = 13$ and $h(\Delta_f) = h(\Delta_1)(23 - 1) = 286$. The equivalence class represented by the reduced ideal

$\mathfrak{g} = 15\mathbb{Z} + \frac{-7 + \sqrt{-539051}}{2}\mathbb{Z} = (15, -7)$

generates $Cl(\Delta_f)$.

Suppose we wish to compute the discrete logarithm of $[\mathfrak{a}]$ with respect to the base $[\mathfrak{g}]$ in $Cl(\Delta_f)$, where

$\mathfrak{a} = 11\mathbb{Z} + \frac{9 + \sqrt{-539051}}{2}\mathbb{Z} = (11, 9)$.



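That is, we want to find $x$ such that $\mathfrak{g}^x \sim \mathfrak{a}$. Since $\mathfrak{g}$ generates $Cl(\Delta_f)$, we know that such an $x$ exists. Following ReduceDLP (Algorithm 2), we first compute $[\mathfrak{G}] = [\varphi^{-1}(\mathfrak{g})]$ and $[\mathfrak{A}] = [\varphi^{-1}(\mathfrak{a})]$ and solve the discrete logarithm problem

$\mathfrak{G}^{x_1} \sim \mathfrak{A}$

in $Cl(\Delta_1)$. We have $\mathfrak{G} = 15\mathbb{Z} + \frac{1 + \sqrt{-1019}}{2}\mathbb{Z} = (15, 1)$, $\mathfrak{A} = (11, 9)$, and we easily compute $x_1 = 9$.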

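At this point we know that $x$ has the form $x = c \cdot h(\Delta_1) + x_1 = 13c + 9$, and it remains to compute $c$. Again following ReduceDLP (Algorithm 2), we compute generator representations $\alpha, \gamma$ of $[\alpha], [\gamma] \in (\mathcal{O}_{\Delta_1}/f\mathcal{O}_{\Delta_1})^*$ such that $\psi([\alpha]) = [\mathfrak{a}/\mathfrak{g}^{x_1}]$ and $\psi([\gamma]) = [\mathfrak{g}^{h(\Delta_1)}]$. Following Std2Gen (Algorithm 1), we first compute

$\mathfrak{b} \sim \mathfrak{a}/\mathfrak{g}^{x_1} = \mathfrak{a}/\mathfrak{g}^{9} = (311, 277)$

and

$\mathfrak{c} \sim \mathfrak{g}^{h(\Delta_1)} = \mathfrak{g}^{13} = (297, 295)$.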

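To find $\alpha$ and $\gamma$ we compute the principal ideals $\mathfrak{B} = \varphi^{-1}(\mathfrak{b})$ and $\mathfrak{C} = \varphi^{-1}(\mathfrak{c})$, and reduce them while simultaneously computing their generators reduced modulo $f\mathcal{O}_{\Delta_1}$, which we take as $\alpha$ and $\gamma$. We obtain $\mathfrak{B} = (311, -15) = (\alpha)$ and $\mathfrak{C} = (297, -13) = (\gamma)$ where

$\alpha = -8 + 1\omega, \qquad \gamma = -7 + 1\omega$

and $\omega = \frac{1 + \sqrt{-1019}}{2}$.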

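To compute $c$, we need to solve the discrete logarithm problem

$[\gamma]^c = [\alpha]$ (in $\mathrm{Ker}(\varphi_{Cl}^{-1}) \cong (\mathcal{O}_{\Delta_1}/f\mathcal{O}_{\Delta_1})^*/\iota((\mathbb{Z}/f\mathbb{Z})^*)$).

For this example, we have $(\Delta_1/f) = (-1019/23) = 1$, and thus $(\mathcal{O}_{\Delta_1}/f\mathcal{O}_{\Delta_1})^* \cong \mathbb{F}_{23}^* \times \mathbb{F}_{23}^*$ by [12, Lemma 8]. Since $\omega \equiv 14 \pmod{23}$ and $\omega \equiv 10 \pmod{23}$ under the two projections, we obtain

$\gamma \mapsto (-7 + 1\omega \bmod 23, \; -7 + 1\omega \bmod 23) = (7, 3) \in \mathbb{F}_{23}^* \times \mathbb{F}_{23}^*$

and

$\alpha \mapsto (-8 + 1\omega \bmod 23, \; -8 + 1\omega \bmod 23) = (6, 2) \in \mathbb{F}_{23}^* \times \mathbb{F}_{23}^*$.

Since $\mathrm{Ker}(\varphi_{Cl}^{-1}) \cong (\mathbb{F}_p^* \times \mathbb{F}_p^*)/\iota(\mathbb{F}_p^*)$, we need to find $c$ by solving the discrete logarithm problem $(7, 3)^c = l\,(6, 2)$ in $\mathbb{F}_{23}^* \times \mathbb{F}_{23}^*$ for some $l \in \mathbb{F}_{23}^*$. This yields

$7^c \equiv 6l \pmod{23}, \qquad 3^c \equiv 2l \pmod{23}$,

and we combine these two discrete logarithm problems to obtain one discrete logarithm problem in $\mathbb{F}_{23}^*$:

$(7/3)^c \equiv (6/2) \pmod{23} \iff 10^c \equiv 3 \pmod{23}$.

Solving yields $c = 20$, and finally $x = 13 \cdot 20 + 9 = 269$. It is easy to verify that $x$ is indeed the desired discrete logarithm: simply compute the reduced ideal $\mathfrak{g}^{269}$ and verify that it is equal to the reduced ideal $\mathfrak{a}$.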
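
The ideal arithmetic of this example can be replayed with a short Python sketch. It is an illustration only, not the LiDIA code used by the authors: a reduced ideal $(a, b)$ of discriminant $D$ is identified with the positive definite binary quadratic form $(a, b, (b^2 - D)/(4a))$, products are computed with the classical composition formula, and the helper names `xgcd`, `reduce_form`, `compose` and `power` are assumptions made for this sketch.

```python
def xgcd(a, b):
    """Return (g, x, y) with a*x + b*y = g = gcd(a, b) >= 0."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        q = a // b
        a, b = b, a - q * b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    if a < 0:
        a, x0, y0 = -a, -x0, -y0
    return a, x0, y0


def reduce_form(a, b, D):
    """Reduce the form (a, b, c) of discriminant D < 0; return (a, b)."""
    while True:
        b %= 2 * a                      # normalize b into (-a, a]
        if b > a:
            b -= 2 * a
        c = (b * b - D) // (4 * a)
        if a > c:                       # swap outer coefficients
            a, b = c, -b
            continue
        if a == c and b < 0:
            b = -b
        return a, b


def compose(f1, f2, D):
    """Compose two forms (a1, b1), (a2, b2) of discriminant D and reduce."""
    (a1, b1), (a2, b2) = f1, f2
    s = (b1 + b2) // 2
    g1, u1, v1 = xgcd(a1, a2)           # u1*a1 + v1*a2 = g1
    e, t, w = xgcd(g1, s)               # t*g1 + w*s = e
    u, v = t * u1, t * v1               # u*a1 + v*a2 + w*s = e
    a3 = a1 * a2 // (e * e)
    b3 = (u * a1 * b2 + v * a2 * b1 + w * ((b1 * b2 + D) // 2)) // e
    return reduce_form(a3, b3, D)


def power(form, n, D):
    """Square-and-multiply exponentiation in the class group."""
    result = (1, D % 2)                 # principal form
    base = form
    while n > 0:
        if n & 1:
            result = compose(result, base, D)
        base = compose(base, base, D)
        n >>= 1
    return result


if __name__ == "__main__":
    D = -539051
    g, a = (15, -7), (11, 9)
    print(power(g, 13, D))              # compare with c = (297, 295) in the text
    print(power(g, 269, D) == a)        # the final check suggested in the text
```

With these conventions one can confirm the intermediate value $\mathfrak{g}^{13} = (297, 295)$ and also that $\mathfrak{g}^{17} = (11, -9)$, the inverse class of $\mathfrak{a}$; the latter agrees with $x = 269 \equiv -17 \pmod{286}$.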



4 Towards Practical Non-interactive Cryptosystems 

Before we explain our system setup we list the crucial properties: 



Required Properties 



1. The discrete logarithm problem (DLP) in $Cl(\Delta_p)$ without knowing the factorization of $\Delta_p = \Delta_1 p^2$ is infeasible. To determine bounds for $\Delta_1$ and $p$, we make use of the heuristic model from [7], which is a refinement of Lenstra and Verheul's approach [16], since it also takes into account the asymptotically vanishing $o(1)$-part in subexponential algorithms. We will now derive bounds for the parameters such that an attacker would need to spend about 90,000 MIPS years to break the system. This approximately amounts to a ten-fold higher workload than the recent factorization of RSA155 and hence corresponds to the very minimum requirements. The estimates in [7, Table 3] state that $\Delta_p$ should have at least 576, 667, and 423 bits to prevent factoring $\Delta_p$ with the GNFS, factoring $\Delta_p$ with ECM, and computing discrete logarithms in $Cl(\Delta_p)$ with the SIQS-analogue [14], respectively.

1.1 $\Delta_p$ is large enough that using the subexponential algorithm from [13] to directly compute discrete logarithms in $Cl(\Delta_p)$ is infeasible. $|\Delta_p| > 2^{423}$ implies an expected workload of more than 90,000 MIPS years.

1.2 $\Delta_p$ cannot be factored to reduce the DLP to DLPs in $Cl(\Delta_1)$ and $\mathbb{F}_p^*$ (or $\mathbb{F}_{p^2}^*$).

1.2.1 $\Delta_p$ is large enough so that the Number Field Sieve would need more than 90,000 MIPS years. This yields $|\Delta_p| > 2^{576}$.

1.2.2 $\Delta_1$ and $p$ are large enough that it would take more than 90,000 MIPS years to find them with the Elliptic Curve Method. This implies $|\Delta_1|, p > 2^{222}$.



2. Ai,p must be small enough to enable the KGC to compute discrete log- 
arithms in Cl{Ai) and F* using subexponential algorithms. Ai,p < 2®™ 
seems to be feasible. 

3. $Cl(\Delta_p)$ must be cyclic.



It is easy to see that the following setup satisfies all of the above requirements.

System Setup

1. The KGC randomly chooses a prime g = 3 (mod 4), g > sets Ai = —q 
and computes $h(\Delta_1)$ and the group structure of $Cl(\Delta_1)$ with the algorithm from [14]. The Cohen-Lenstra heuristics [4] suggest that $Cl(\Delta_1)$ is cyclic with probability > 0.97. If $Cl(\Delta_1)$ is not cyclic, the KGC selects another prime $q$ until it is cyclic.

2. The KGC chooses a prime p > with {Ai/p) = 1 and gcd(p— 1, h{Ai) = 1 
such that the SNFS can be applied as in [24], and computes $\Delta_p = \Delta_1 p^2$. The gcd condition ensures that $Cl(\Delta_p)$ is cyclic.

3. The KGC computes a generator $\mathfrak{g}$ of $Cl(\Delta_p)$ and publishes it together with $\Delta_p$.

Given a generator 6 of Cl{Ai), which the KGC can easily obtain during the 
computation of Cl{Ai) [14, Algorithm 6.1], it is also easy in practice to find a 
generator g of Cl{Ap) with the additional property that = 0. The KGC 

repeatedly selects random values of a G Oai and takes the first g = cj){a&) 
such that QH^p)/di ^ £qj. positive divisor di of h{Ap). Although h{Ap) 

is approximately as large as in practice it has sufficiently many small 

factors that this condition can be verified with high probability. 
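
The two arithmetic conditions on the conductor in step 2 are easy to test. The following Python sketch checks them for toy-sized inputs; it is an assumption-laden illustration (the name `conductor_ok` is hypothetical, primality is tested by trial division, and neither the size bound on $p$ nor the SNFS-friendliness required by the setup is checked).

```python
from math import gcd


def is_prime(n):
    """Trial division; adequate only for toy-sized n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def legendre(d, p):
    """Legendre symbol (d/p) for an odd prime p via Euler's criterion."""
    r = pow(d % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def conductor_ok(p, delta1, h1):
    """Check that p is an odd prime, (delta1/p) = 1 and gcd(p - 1, h1) = 1."""
    return (
        p % 2 == 1
        and is_prime(p)
        and legendre(delta1, p) == 1
        and gcd(p - 1, h1) == 1
    )


if __name__ == "__main__":
    # Toy parameters borrowed from the example in Section 3.1.
    delta1, h1 = -1019, 13
    print([p for p in range(3, 100) if conductor_ok(p, delta1, h1)])
```

In the split case the kernel has order $p - 1$, so requiring $\gcd(p - 1, h(\Delta_1)) = 1$ makes the two group orders coprime, which is the property the setup relies on for cyclicity.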



User Registration 

1. Bob requests the public key b corresponding to his identity IDb at the KGC. 

2. The KGC verifies Bob’s identity, for example, using a passport, and starts 
with the key generation. 

3. The KGC computes the 128-bit hash $id = h(ID_B)$ of Bob's identity using, for example, MD5 [22], and embeds $id$ into a group element of $Cl(\Delta_p)$ by taking the largest prime $p_B < id$ for which $(\Delta_p/p_B) = 1$ and computing the prime ideal $\mathfrak{b} = p_B\mathbb{Z} + \frac{b_B + \sqrt{\Delta_p}}{2}\mathbb{Z}$, where $b_B$ is the uniquely determined square root of $\Delta_p$ mod $4p_B$ with $0 < b_B < p_B$. Note that $\mathfrak{b}$ is already reduced, since $\sqrt{|\Delta_p|}/2 > 2^{128} > p_B$. If the KGC recognizes that $\mathfrak{b}$ is already assigned to another user it will ask Bob to choose another identity, for example, his postal address.

4. Finally, the KGC computes the discrete logarithm $b$ such that $\mathfrak{g}^b \sim \mathfrak{b}$ using the secret knowledge of the conductor $p$ and the reduction procedure described in Section 3, and returns $b$ to Bob.

As soon as all users are registered this way the KGC can destroy the factorization of $\Delta_p$ and cease to exist. The users can obtain any other user's authentic public key simply by hashing that user's identity and computing the largest prime ideal whose norm is less than the hash value. Each user has a public/private key-pair $(\mathfrak{a}, a)$ with $\mathfrak{a} \sim \mathfrak{g}^a$, so discrete logarithm-based protocols such as Diffie-Hellman or ElGamal can be directly applied in the class group $Cl(\Delta_p)$.
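
The public-key derivation that every user can perform (hash the identity, take the largest suitable prime below the hash, attach the square root) might be sketched as follows. This is a hedged illustration, not the authors' implementation: the function names are hypothetical, the Miller-Rabin test with fixed bases is only probabilistic for 128-bit inputs, the toy discriminant $-539051$ from Section 3.1 stands in for a real $\Delta_p$, and MD5 appears only because the text names it as an example.

```python
import hashlib

BASES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)


def is_probable_prime(n):
    """Miller-Rabin with fixed bases (probabilistic for large n)."""
    if n < 2:
        return False
    for q in BASES:
        if n % q == 0:
            return n == q
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in BASES:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True


def sqrt_mod_prime(n, p):
    """Tonelli-Shanks: a square root of the quadratic residue n modulo p."""
    n %= p
    if p % 4 == 3:
        return pow(n, (p + 1) // 4, p)
    q, s = p - 1, 0
    while q % 2 == 0:
        q //= 2
        s += 1
    z = 2
    while pow(z, (p - 1) // 2, p) != p - 1:   # find a non-residue
        z += 1
    m, c, t, r = s, pow(z, q, p), pow(n, q, p), pow(n, (q + 1) // 2, p)
    while t != 1:
        i, t2 = 0, t
        while t2 != 1:
            t2 = t2 * t2 % p
            i += 1
        b = pow(c, 1 << (m - i - 1), p)
        m, c, t, r = i, b * b % p, t * b * b % p, r * b % p
    return r


def identity_to_prime_ideal(identity, delta):
    """Return (p_B, b_B) for the largest prime p_B < id with (delta/p_B) = 1."""
    digest = hashlib.md5(identity.encode("utf-8")).digest()
    cand = int.from_bytes(digest, "big") - 1
    while cand > 2:
        if is_probable_prime(cand) and pow(delta % cand, (cand - 1) // 2, cand) == 1:
            r = sqrt_mod_prime(delta, cand)
            # Pick the root with the parity of delta, so b^2 = delta (mod 4p).
            b = r if (r - delta) % 2 == 0 else cand - r
            return cand, b
        cand -= 1
    raise ValueError("no suitable prime found")


if __name__ == "__main__":
    print(identity_to_prime_ideal("bob@example.org", -539051))
```

The parity adjustment is what makes $b_B$ unique in $0 < b_B < p_B$: of the two roots $r$ and $p_B - r$ modulo the odd prime $p_B$, exactly one has the same parity as $\Delta_p$.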

Preliminary experiments, together with computational experience using the subexponential algorithms from [13] and [24], indicate that a KGC with modest computational resources, for example, a small network of Pentium processors,
should be able to set up a key distribution system using p, g ~ at least. For 
such an example, we estimate that after a precomputation of about 3 days on a cluster of sixteen 550 MHz Pentium IIIs for computing the class group $Cl(\Delta_1)$, each user registration would take about 1 day on a single 550 MHz Pentium III, the vast majority of this time being spent on the computation of discrete logarithms in $\mathbb{F}_p^*$. However, adding more machines to the cluster yields a linear speedup in both the precomputation stage and part of the user registration stage. Thus, although this level of complexity is far from ideal, unlike the case of $(\mathbb{Z}/n\mathbb{Z})^*$ it is at least possible to set up non-interactive systems with secure parameters in $Cl(\Delta_p)$. More detailed computational results will appear in the full version of the paper [10].

Abstract. Cryptosystems based on the discrete logarithm problem in the infrastructure of a real quadratic number field [7,19,2] are very interesting from a theoretical point of view, because this problem is known to be at least as hard as, and when considering today's algorithms (as in [11]) much harder than, factoring integers. However, it seems that the cryptosystems sketched in [2] have not been implemented yet and consequently it is hard to evaluate the practical relevance of these systems. Furthermore, as [2] lacks any proofs regarding the involved approximation precisions, it was not clear whether the second communication round, as required in [7,19], really could be avoided without substantial slowdown.
In this work we will prove a bound for the necessary approximation precision of an exponentiation using quadratic numbers in power product representation and show that the precision given in [2] can be lowered considerably. As the highly space consuming power products cannot be applied in environments with limited RAM, we will propose a simple (CRIAD$^1$-) arithmetic which entirely avoids these power products. Besides the obvious savings in terms of space this method is also about 30% faster. Furthermore one may apply more sophisticated exponentiation techniques, which finally result in a ten-fold speedup compared to [2].



1 Introduction 

Unlike for imaginary quadratic orders, there is no known polynomial time algorithm which decides whether two given ideals in a real quadratic order $\mathcal{O}_\Delta$ are equivalent, and consequently it is impossible to set up DL-based cryptosystems in real quadratic class groups $Cl(\Delta)$. However, as noted by Shanks [21], there is some infrastructure in the cycle of reduced principal ideals, which resembles an abelian group. Buchmann, Williams and Scheidler [7,18,19] showed how to construct a Diffie-Hellman-like key agreement procedure using this infrastructure. They (essentially) represent a principal ideal $\mathfrak{A}$ by a pair $(\mathfrak{a}, a)$, where $\mathfrak{a} = \gamma\mathfrak{A}$ is the reduced ideal closest to $\mathfrak{A}$ and $a$ is a rational approximation to the distance $\log(\gamma)$ between $\mathfrak{a}$ and $\mathfrak{A}$. The major problem with their approach is that, because it is (in general) impossible to find the closest reduced ideal when using a fixed approximation precision, they compute a pair of very close reduced ideals, i.e. the left and right neighbour, and uniquely determine the common key in a second communication round. Because this additional round is necessary in their setup, it is impossible to construct more advanced protocols, e.g. signature protocols. In [2] it was proposed to use the exact relative generator $\gamma \in \mathbb{Q}(\sqrt{\Delta})$ in power product representation [6] and only compute a rational approximation of its logarithm at the very end. The authors claimed that it is possible to avoid the second communication round and hence implement advanced cryptographic protocols (cf. [3]). However, as [2] lacks any proofs regarding the involved approximation precisions, it was not clear whether the second communication round, as required in [7,19], can be avoided at all, and, if so, whether this is possible without substantially decreasing the efficiency.

$^1$ CRIAD is an abbreviation for Close Reduced Ideal and Approximated relative Distance.

In this work we will prove a bound for the necessary approximation precision of an exponentiation using quadratic numbers in power product representation and show that the precision given in [2] may be lowered considerably. For a discriminant with 1024 bits and 160-bit secret exponents in a Diffie-Hellman key-agreement protocol we will see that a precision of $512 + 2 + 160 + 3 = 677$ bits is sufficient, where [2] suggests a precision of $1024 + 6 \cdot 160 + 6 = 1990$ bits for this scenario. As more sophisticated cryptographic protocols require not only an exponentiation but also a multiplication (and possibly inversion) routine, we will introduce the so-called CRIAD-multiplication, which is (essentially) associative (see Corollary 1). Using CRIADmult one is able to construct a binary exponentiation routine which requires a precision of $512 + 2 + 2 \cdot 160 + 2 = 836$ bits in the above scenario, but entirely avoids power products and solely uses floating point numbers for the logarithms. Note that this is very important in environments with restricted RAM. This binary exponentiation variant (CRIADexp, Algorithm 3) with a precision of 836 bits is already about 30% faster than the exponentiation variant using power products and an approximation precision of 677 bits. Another important feature of this approach is that one may use CRIADmult to implement more sophisticated exponentiation routines, which finally result in a ten-fold speedup compared to [2].

The paper is organized as follows: In Section 2 we will recall the necessary basics concerning the infrastructure of the principal class in a real quadratic number field. In Section 3 we collect the necessary definitions concerning rational approximations of real numbers. In Section 4 we will recall the representation of principal ideals from [2] and prove lower bounds for the approximation precision required to provide a unique representation (see Proposition 2), which is necessary to avoid the second communication round, as required in [7,19]. In Section 5 we will show what precision is necessary to implement an exponentiation technique using power products. In Section 6 we will introduce the CRIAD-arithmetic, which is necessary to implement more sophisticated protocols and makes it possible to avoid power products. In Section 7 we will use this basic arithmetic to derive more sophisticated exponentiation techniques. Due to space restrictions we will only give the results and refer to [12] for a detailed presentation and proofs. We will conclude this work by providing timings of a first implementation in Section 8. In the full paper [12] we will also provide a complexity analysis and the treatment of more sophisticated cryptographic protocols, e.g. signature protocols.



2 The Infrastructure of the Principal Class 
in Real Quadratic Number Fields 



In this section we will recall the very basic notions of the infrastructure of the 
principal class in real quadratic number fields needed in the sequel and refer to 
[8,10,1 j 13] for a complete reference. 

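Let $\mathbb{Q}(\sqrt{\Delta})$ be the real quadratic number field of discriminant $\Delta > 0$ and $\mathcal{O}_\Delta$ be its ring of integers. We denote $\mathcal{O}_\Delta$-ideals by gothic letters $\mathfrak{a}, \mathfrak{b}, \ldots, \mathfrak{A}, \mathfrak{B}, \ldots$

An important invariant of the number field $\mathbb{Q}(\sqrt{\Delta})$ is the regulator $R_\Delta = \log \varepsilon$, where $\varepsilon$ is the smallest unit larger than one. It is well known (see e.g. [20]) that the computation of $R_\Delta$ is at least as hard as factoring $\Delta$.

Two $\mathcal{O}_\Delta$-ideals $\mathfrak{a}$ and $\mathfrak{b}$ are called equivalent if there is a $\gamma \in \mathbb{Q}(\sqrt{\Delta})$ such that $\mathfrak{a}\gamma = \mathfrak{b}$. $\gamma$ is called a relative generator of $\mathfrak{b}$ w.r.t. $\mathfrak{a}$. $\gamma$ is unique up to multiplication by units. Equivalence of ideals is an equivalence relation. The equivalence classes are called ideal classes. If $\mathfrak{a} = \mathcal{O}_\Delta$ then $\mathfrak{b} = \gamma\mathcal{O}_\Delta$ is called a principal ideal and $\gamma$ is simply called a generator of $\mathfrak{b}$. The set of principal ideals is denoted by $\mathcal{P}_\Delta$ and forms an infinite abelian subgroup of $\mathcal{I}_\Delta$. The factor group $Cl(\Delta) = \mathcal{I}_\Delta/\mathcal{P}_\Delta$ is a finite abelian group and is called the ideal class group of $\mathcal{O}_\Delta$.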

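For two equivalent ideals $\mathfrak{a}$ and $\mathfrak{b} = \gamma\mathfrak{a}$ we define the distance between $\mathfrak{a}$ and $\mathfrak{b}$ by

$\delta(\mathfrak{a}, \mathfrak{b}) = \frac{1}{2}\log\left|\frac{\gamma}{\bar{\gamma}}\right| \pmod{R_\Delta}$,    (1)

where $\bar{\gamma} = (x - y\sqrt{\Delta})/z$ denotes the (real) conjugate of $\gamma = (x + y\sqrt{\Delta})/z$. If $\mathfrak{a} = \mathcal{O}_\Delta$, we simply write $\delta(\mathfrak{b})$ instead of $\delta(\mathcal{O}_\Delta, \mathfrak{b})$.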
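
The distance in (1) is straightforward to evaluate numerically. The sketch below is a minimal illustration using ordinary double-precision floats, which is an assumption made for brevity: the paper's whole point is that the precision must be controlled explicitly, and the reduction modulo the regulator is deliberately left out. The function name `distance` is hypothetical.

```python
import math


def distance(x, y, z, delta):
    """Return 0.5 * log|gamma / conj(gamma)| for gamma = (x + y*sqrt(delta)) / z.

    The result is NOT reduced modulo the regulator.
    """
    root = math.sqrt(delta)
    gamma = (x + y * root) / z
    conj = (x - y * root) / z
    return 0.5 * math.log(abs(gamma / conj))


if __name__ == "__main__":
    # Delta = 5: the fundamental unit is (1 + sqrt(5)) / 2, and its
    # unreduced distance equals the regulator log((1 + sqrt(5)) / 2).
    print(distance(1, 1, 2, 5))
    # The square of the unit, (3 + sqrt(5)) / 2, lies twice as far away.
    print(distance(3, 1, 2, 5))
```

Because relative generators multiply when ideals are multiplied, these logarithms add, which is the additivity that the infrastructure exploits.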

Given an ideal $\mathfrak{A}$ we denote by $\rho()$ (see e.g. [13, REDUCE_REAL, Algorithm 2.6]) the reduction operator, which computes an equivalent reduced ideal $\mathfrak{a} = \rho(\mathfrak{A})$. We denote $n$ ($n \ge 1$) successive applications of $\rho()$ by $\rho^n()$. If $\mathfrak{a}$ is a reduced ideal then $\rho^n(\mathfrak{a})$ is also reduced and there is some $l \in \mathbb{Z}_{\ge 1}$ such that $\mathfrak{a} = \rho^l(\mathfrak{a})$. In [22] it is shown that, for arbitrary $\Delta$, the smallest such $l$ may be as large as $O(\sqrt{\Delta}\log\log\Delta)$.

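Let $\mathfrak{a} = (a, b)$ be a reduced ideal. Then one may use [13, Algorithm 2.11] to compute the right neighbour $\mathfrak{a}_+ = (a_+, b_+) = \rho(\mathfrak{a})$, where

$\delta(\mathfrak{a}, \mathfrak{a}_+) = \frac{1}{2}\log\left|\frac{b_+ + \sqrt{\Delta}}{b_+ - \sqrt{\Delta}}\right|$,

and [13, Algorithm 2.12] respectively to compute the left neighbour $\mathfrak{a}_- = (a_-, b_-) = \rho^{-1}(\mathfrak{a})$, where

$\delta(\mathfrak{a}, \mathfrak{a}_-) = -\frac{1}{2}\log\left|\frac{b_- + \sqrt{\Delta}}{b_- - \sqrt{\Delta}}\right|$,

of $\mathfrak{a}$.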

It is easy to see that traveling around a complete circle and obtaining $\rho^l(\mathfrak{a}) = \mathfrak{a}$ and thus $\mathfrak{a}(\varepsilon) = \mathfrak{a}$ yields

$\delta(\rho^l(\mathfrak{a}), \mathfrak{a}) = \frac{1}{2}\log\left|\frac{\varepsilon^2}{N(\varepsilon)}\right| = \frac{1}{2}\log(\varepsilon^2) = \log\varepsilon = R_\Delta$.

Instead of this naive strategy, which is obviously only applicable for very small $\Delta$, Shanks [21] made use of the infrastructure of the principal class of a real quadratic order to compute $R_\Delta$. In the following we will recall the most important properties of this infrastructure.

In [15] it was shown that 

log (^l/yZ+ 1^ < |(5(a, a+)| < logTZ, (2) 



and for three consecutive ideals $\mathfrak{a}, \mathfrak{a}_+, \mathfrak{a}_{++}$ we have

$|\delta(\mathfrak{a}, \mathfrak{a}_{++})| > \log 2$.    (3)



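Moreover, it is immediate from the definition that for three equivalent ideals $\mathfrak{a}, \mathfrak{b}, \mathfrak{c}$ we have

$\delta(\mathfrak{a}, \mathfrak{b}) + \delta(\mathfrak{b}, \mathfrak{c}) \equiv \delta(\mathfrak{a}, \mathfrak{c}) \pmod{R_\Delta}$    (4)

and for two pairs $(\mathfrak{a}, \mathfrak{b})$ and $(\mathfrak{c}, \mathfrak{d})$ of principal ideals

$\delta(\mathfrak{a}\mathfrak{c}, \mathfrak{b}\mathfrak{d}) \equiv \delta(\mathfrak{a}, \mathfrak{b}) + \delta(\mathfrak{c}, \mathfrak{d}) \pmod{R_\Delta}$.    (5)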

The last assertions are not surprising, as the set of (invertible) principal ideals $\mathcal{P}_\Delta$ forms a group under multiplication. On the other hand it is easy to see that the subset of reduced principal ideals, together with some combination of multiplication and reduction, does not form a group, because such an operation is either not associative or not closed.

However, it is possible to show that the operation $\circ$, defined by multiplication followed by reduction, is "close to" being a group operation:

Proposition 1. Let $\mathfrak{a}, \mathfrak{b}$ be reduced ideals and $\mathfrak{c} = \mathfrak{a} \circ \mathfrak{b} = \rho(\mathfrak{a}\mathfrak{b})$. Then

$|\delta(\mathfrak{c}, \mathfrak{a}\mathfrak{b})| = |\delta(\mathfrak{c}) - (\delta(\mathfrak{a}) + \delta(\mathfrak{b}))| < 2\log\Delta$.    (6)

Proof. See [8, Proposition 5.8.4]. 

We will see that this deficiency w.r.t. a group operation can be repaired without applying the space consuming power product representation from [6]. In the multiplication procedure for principal ideals in CRIAD-representation (CRIADmult, Algorithm 2) we will additionally compute rational approximations for distances, which allow us to correct the "error" introduced by reduction. We will see in Corollary 1 that this strategy makes the operation CRIADmult (essentially) associative, if one uses a sufficiently high approximation precision.



Detlef Hühnlein and Sachar Paulus



3 Rational Approximations of Real Numbers

The distances between equivalent ideals, as defined in (1), are of the form $\frac{1}{2}\log|\gamma/\bar{\gamma}|$, for $\gamma \in \mathbb{Q}(\sqrt{\Delta})^*$, and hence in general irrational numbers. As the most practical way to handle these distances seems to be the computation of sufficiently accurate rational approximations, we will follow [5,17] and collect the necessary definitions and properties of floating point numbers.

Let $r \in \mathbb{R}$, $r \neq 0$. Then we define $b(r) = \lfloor \log_2 |r| \rfloor + 1$ and $b(0) = 0$.

Definition 1. A floating point number is a pair $f = (m, e)$, where $m, e \in \mathbb{Z}$, $m \neq 0$ or $m = 0$ and $e = 0$. $m$ is called the mantissa and $e$ is called the exponent of $f$. $f = (m, e)$ represents the rational number $q = m \cdot 2^{e - b(m)}$.

Definition 2. Let $r \in \mathbb{R}$ and $k \in \mathbb{Z}$, $j \in \mathbb{Q}_{>0}$.

1. An absolute $k$-approximation for $r$ is a floating point number $f = (m, e)$, such that $|f - r| < 2^{-k}$ and $e \ge b(m) - k - 1$.

2. An absolute $(j, k)$-approximation for $r$ is a floating point number $f = (m, e)$, such that $|f - r| < \frac{j}{2^k}$ and $e \ge b(m) - k + \lceil \log_2(j) \rceil - 1$.

Remark 1. The latter definition is necessary to consider the round-off error in some computations more closely. Here we will briefly relate a $(j, k)$-approximation to an $l$-approximation.

Let $f = (m, e)$, where $e \ge b(m) - k + \lceil \log_2(j) \rceil - 1$, be an absolute $(j, k)$-approximation for $r \in \mathbb{R}$. Then $|f - r| < \frac{j}{2^k} \le 2^{-(k - \lceil \log_2(j) \rceil)}$ and one immediately sees that $f$ is precisely an absolute $l$-approximation for $l = k - \lceil \log_2(j) \rceil$. On the other hand, it is clear from the definition that a $(1, k)$-approximation is precisely a $k$-approximation.

We use Markus Maurer's functions from the xbigfloat-class of LiDIA [16] to implement the necessary floating point arithmetic. A theoretical treatment of these functions may be found in [17]. Besides addition, subtraction and the comparison of floating point numbers we will also need the following two functions:

- Trunc($f, k$)

denotes the LiDIA-function Truncate($f, k$) and returns the "$k$ significant bits" of a floating point number $f = (m, e) = m \cdot 2^{e - b(m)}$ by deleting the last $b(m) - k$ bits of the mantissa $m$. To prove a bound for the precision necessary to make our proposed CRIAD-representation unique, we will make use of [17, Lemma 2.3.1], which states that given a $k$-approximation $f = (m, e)$ of a real number $r$, $f' = $ Trunc($f, k + e$) is a $(k - 1)$-approximation of $r$.

- qlog($x, y, k$)

denotes the LiDIA-function a.absolute_Ln_approximation($k$) and returns on input of a number $\gamma = x + y\sqrt{\Delta}$ a $k$-approximation of Lenstra's [15] logarithm $\mathrm{Log}(\gamma) = \frac{1}{2}\log|\gamma/\bar{\gamma}|$. A thorough description of this function may be found in [17, Section 6.1.4].
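
The floating point pairs of Definition 1 and the truncation described above can be modelled directly. The following Python sketch is an illustration built only from the description in the text, not from the LiDIA sources; the names `b`, `value` and `trunc` are assumptions, and the handling of negative mantissas (truncating the magnitude) is a choice made for this sketch.

```python
from fractions import Fraction


def b(m):
    """b(m) = floor(log2|m|) + 1 for a nonzero integer m, and b(0) = 0."""
    return abs(m).bit_length()


def value(f):
    """Rational number m * 2^(e - b(m)) represented by the pair f = (m, e)."""
    m, e = f
    return Fraction(m) * Fraction(2) ** (e - b(m))


def trunc(f, k):
    """Keep the k most significant mantissa bits of f = (m, e)."""
    m, e = f
    if k <= 0:
        return (0, 0)                    # the only valid pair with m = 0
    drop = b(m) - k
    if drop <= 0:
        return f                         # nothing to delete
    sign = -1 if m < 0 else 1
    return (sign * (abs(m) >> drop), e)


if __name__ == "__main__":
    f = (0b1011011, 3)                   # 91 * 2^(3 - 7)
    g = trunc(f, 4)                      # mantissa 0b1011
    print(value(f), value(g), abs(value(f) - value(g)))
```

Since $m \cdot 2^{-b(m)}$ always lies in $[1/2, 1)$ in absolute value, the exponent $e$ alone fixes the magnitude, and deleting the last $b(m) - k$ mantissa bits changes the represented value by less than $2^{e-k}$.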

4 Bounds for the Uniqueness of the CRIAD-Representation

In this section we introduce the CRIAD-representation of principal ideals, as sketched in [2]. After a formal specification of the required properties we will derive bounds for the involved precision to make this representation unique.

To make the line of thought leading to this important representation more transparent, we will start by defining a RIAD-representation, where we do not require that the reduced ideal in this representation is close to the represented ideal.

Definition 3. Let $\mathfrak{A}$ be a (fractional) principal $\mathcal{O}_\Delta$-ideal, $f = (m, e)$ a floating point number and $k \in \mathbb{Z}$, $j \in \mathbb{Q}_{>0}$.

A $(j, k)$-RIAD-representation of $\mathfrak{A}$ is a pair $(\mathfrak{a}, f)$, where $\mathfrak{a}$ is an arbitrary reduced principal ideal and $f$ is an absolute $(j, k)$-approximation for the distance $\delta(\mathfrak{A}, \mathfrak{a})$ between $\mathfrak{A}$ and $\mathfrak{a}$. A $(1, k)$-RIAD-representation is simply called $k$-RIAD-representation.

Given a, not necessarily reduced, principal ideal $\mathfrak{A}$ in standard representation it is easy to compute a $k$-RIAD-representation $(\mathfrak{a}, f)$ for it. The procedure $(\mathfrak{a}, f) = $ Std2RIAD($\mathfrak{A}, k$) uses the standard LiDIA-routine $(\mathfrak{a}, \gamma) = $ REDUCE($\mathfrak{A}$) (see e.g. [13, REDUCE_REAL, Algorithm 2.6]), which computes $\mathfrak{a} = \rho(\mathfrak{A})$ and the relative generator $\gamma = (x + y\sqrt{\Delta})/z = \mathfrak{A}/\mathfrak{a}$, and computes $f = -$qlog($x, y, k$).

As the reduced ideal $\mathfrak{a}$ in this representation will, for example, be used to
derive a common key in a Diffie-Hellman key agreement procedure, it is
especially important to guarantee that both communication partners end up with the
same reduced ideal $\mathfrak{a}$, representing $\mathfrak{A}$, while performing entirely different
computations.

As there are no further requirements for the reduced ideal $\mathfrak{a}$ in the
RIAD-representation $(\mathfrak{a}, f)$ of a principal ideal $\mathfrak{A}$ and there are $c = O(\sqrt{\Delta} \log\log \Delta)$
reduced ideals in the principal cycle, there are obviously c different
RIAD-representations for $\mathfrak{A}$. Among all these RIAD-representations for $\mathfrak{A}$ we will now select
the CRIAD-representation, which will be shown to be uniquely determined if the
involved precision is sufficiently high. If the precision were too low, such that
there were two valid CRIAD-representations $(\mathfrak{a}_i, f_i)$, $i \in \{1, 2\}$, $\mathfrak{a}_1 \neq \mathfrak{a}_2$, for
an ideal $\mathfrak{A}$, then a key agreement procedure would entirely fail to work, or
would need to be "repaired" using a second communication round, as proposed
in [7,19].

Suppose for a moment that the distances could be determined exactly. Then
one could simply define the unique representative for $\mathfrak{A}$ to be the reduced ideal
$\mathfrak{a}$ with the (in absolute value) smallest distance and possibly - if $\mathfrak{A}$ is precisely
in between two reduced ideals - positive distance. In this case it is clear that $\mathfrak{a}$
is uniquely determined.

However, since we are dealing with rational approximations, i.e. we only have
$k < \infty$ many correct bits of the distances at our disposal, some more
considerations are necessary to make the reduced ideal in the CRIAD-representation
unique.
Suppose that all computations were performed with a sufficiently high
precision k, such that we can guarantee that the reduced ideal $\mathfrak{a}$ in the RIAD-representation
$(\mathfrak{a}, f)$ is either the left or right neighbour of $\mathfrak{A}$, and recall that the function
Trunc(g, l) returns (only) the l significant bits of a floating point number g.

Let $(\mathfrak{a}_1, f_1)$, with $f_1 = (m_1, e_1)$, and $(\mathfrak{a}_2, f_2)$, with $f_2 = (m_2, e_2)$, be two
candidates for the CRIAD-representation. If $|\mathrm{Trunc}(f_1, k + e_1)| < |\mathrm{Trunc}(f_2, k + e_2)|$,
then the decision is easy and we choose $(\mathfrak{a}_1, f_1)$ to be the unique
CRIAD-representation.

In the worst case $\mathfrak{A}$ is - in the scope of the fixed approximation precision -
precisely in between the two reduced ideals $\mathfrak{a}_1$ and $\mathfrak{a}_2$. Then the RIAD-representations
$(\mathfrak{a}_1, f_1)$ and $(\mathfrak{a}_2, f_2)$ have the following properties:

For the distances we have

$$|\delta(\mathfrak{A}, \mathfrak{a}_1) - f_1| < 2^{-k} \quad\text{and}\quad |\delta(\mathfrak{A}, \mathfrak{a}_2) - f_2| < 2^{-k} \tag{7}$$

since we have k-RIAD-representations and

$$\mathrm{Trunc}(f_1, k + e_1) = -\mathrm{Trunc}(f_2, k + e_2), \tag{8}$$

since $\mathfrak{A}$ is - in the scope of the fixed approximation precision - precisely
in between $\mathfrak{a}_1$ and $\mathfrak{a}_2$.

We assume w.l.o.g. that $f_1 > 0$ and choose $(\mathfrak{a}_1, f_1)$ to be the unique
CRIAD-representation for $\mathfrak{A}$, even if the precise distances may satisfy

$$|\delta(\mathfrak{A}, \mathfrak{a}_1)| > |\delta(\mathfrak{A}, \mathfrak{a}_2)|.$$

Thus the reduced ideal $\mathfrak{a}$ in the CRIAD-representation $(\mathfrak{a}, f)$ is not
necessarily the reduced ideal closest to $\mathfrak{A}$, but nevertheless uniquely determined, if the
approximation precision, as part of the system parameters, is sufficiently high.

Now we will put the above informal ideas into a more formal shape, such that
we will be able to determine the necessary precision in order to guarantee that
the reduced ideal $\mathfrak{a}$ in a CRIAD-representation is uniquely determined.

Definition 4. Let $k \in \mathbb{Z}$, $j \in \mathbb{Q}_{>0}$ and $l = k - \lceil \log_2(j) \rceil$. Then a (j, k)-CRIAD-representation
of $\mathfrak{A}$ is defined to be a (j, k)-RIAD-representation $(\mathfrak{a}, f)$, where
$f = (m, e)$, of $\mathfrak{A}$, satisfying the following properties:

1. $|\mathrm{Trunc}(f, l + e)| \le |\mathrm{Trunc}(f', l + e')|$ for all (j, k)-RIAD-representations $(\mathfrak{a}', f')$,
where $f' = (m', e')$, of $\mathfrak{A}$ and

2. if $(\mathfrak{a}_1, f_1)$ and $(\mathfrak{a}_2, f_2)$ are two (j, k)-RIAD-representations, which satisfy (1.),
where $f_1 > 0$ and $f_2 < 0$, then $(\mathfrak{a}, f) = (\mathfrak{a}_1, f_1)$.

If $\mathfrak{A}$ is reduced, then we call $(\mathfrak{A}, 0)$ a (0, k)-CRIAD-representation for any
$k \in \mathbb{Z}$. A (1, k)-CRIAD-representation is simply called a k-CRIAD-representation.

Definition 5. A (j, k)-CRIAD-representation $(\mathfrak{a}, f)$ is called unique, if there is
no $l = k - \lceil \log_2(j) \rceil$-CRIAD-representation $(\mathfrak{a}', f')$, $\mathfrak{a}' \neq \mathfrak{a}$, for a given ideal $\mathfrak{A}$,
which satisfies (1.) and possibly (2.) in the above definition.
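The selection rule of Definition 4 can be sketched as follows. This is only an illustration under stated assumptions: floating point numbers are modelled as pairs (m, e) with value m * 2^(e - b(m)), the ideals are replaced by plain labels, and the names `value`, `trunc` and `select_criad` are hypothetical rather than LiDIA routines.

```python
from fractions import Fraction

def value(f):
    """Exact value of f = (m, e), assuming f stands for m * 2**(e - b(m))."""
    m, e = f
    return Fraction(m) * Fraction(2) ** (e - abs(m).bit_length())

def trunc(f, k):
    """Keep the k most significant mantissa bits of f = (m, e)."""
    m, e = f
    drop = abs(m).bit_length() - k
    if drop <= 0:
        return (m, e)
    sign = -1 if m < 0 else 1
    return (sign * (abs(m) >> drop), e)

def select_criad(candidates, l):
    """Pick the candidate (label, f) singled out by properties 1 and 2.

    Property 1: smallest truncated distance in absolute value.
    Property 2: on a tie, the candidate with positive distance wins.
    """
    def key(candidate):
        _, (m, e) = candidate
        truncated = trunc((m, e), l + e)
        return (abs(value(truncated)), 0 if m > 0 else 1)
    return min(candidates, key=key)

# Two neighbours whose truncated distances coincide up to sign: the tie rule applies.
tie = [("a2", (-735, 3)), ("a1", (733, 3))]
# Two neighbours with clearly different truncated distances.
clear = [("a1", (900, 3)), ("a2", (-733, 3))]
```

Here `select_criad(tie, 4)` returns the candidate labelled "a1" because its distance is positive, while `select_criad(clear, 4)` returns "a2" because its truncated distance is smaller in absolute value.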
It remains to determine a bound for the precision to make such a
CRIAD-representation unique. For this purpose we will proceed in two steps. In Lemma
1 we will present a different formulation of the uniqueness problem for
CRIAD-representations. We will show that this representation is unique if in a certain
real, open interval of width $2^{-k + \lceil \log_2(j) \rceil + 2}$ there is only one reduced ideal. In
Proposition 2 we will derive a bound such that in an interval of said width there
can be only one reduced ideal.

Lemma 1. Let $(\mathfrak{a}_1, f_1)$, where $f_1 = (m_1, e_1)$, be a (j, k)-CRIAD-representation
of a principal ideal $\mathfrak{A}$, $l = k - \lceil \log_2(j) \rceil$ and $f = \mathrm{Trunc}(f_1, l + e_1)$. This
CRIAD-representation is unique, if there is no reduced ideal $\mathfrak{a}_2 \neq \mathfrak{a}_1$, such that

$$\delta(\mathfrak{a}_2) \in \left]\delta(\mathfrak{A}) - f - 2^{-l+1},\; \delta(\mathfrak{A}) - f + 2^{-l+1}\right[.$$

Proof. Let $l = k - \lceil \log_2(j) \rceil$. Considering the above definition, we see that the
(j, k)-CRIAD-representation $(\mathfrak{a}_1, f_1)$, with $f_1 = (m_1, e_1)$, for a principal ideal $\mathfrak{A}$
is not unique if there is (at least) one other (j, k)-CRIAD-representation $(\mathfrak{a}_2, f_2)$,
with $f_2 = (m_2, e_2)$, for $\mathfrak{A}$, such that $\mathfrak{a}_2 \neq \mathfrak{a}_1$ and $f = \mathrm{Trunc}(f_1, l + e_1) =
\mathrm{Trunc}(f_2, l + e_2)$.

Considering the involved distances, we have

$$|\delta(\mathfrak{A}, \mathfrak{a}_i) - f_i| < j \cdot 2^{-k} \le 2^{-l}, \quad i \in \{1, 2\},$$

as $(\mathfrak{a}_i, f_i)$ are (j, k)-CRIAD-representations.

By [17, Lemma 2.3.1] we lose one bit of precision by computing

$$f = \mathrm{Trunc}(f_1, l + e_1) = \mathrm{Trunc}(f_2, l + e_2).$$

I.e. as $f_i$, $i \in \{1, 2\}$, are absolute l-approximations for $\delta(\mathfrak{A}, \mathfrak{a}_i)$ we know that f
is an absolute (l - 1)-approximation for $\delta(\mathfrak{A}, \mathfrak{a}_i)$ and we obtain

$$|\delta(\mathfrak{A}, \mathfrak{a}_i) - f| = |\delta(\mathfrak{A}) - f - \delta(\mathfrak{a}_i)| < 2^{-l+1}, \quad i \in \{1, 2\}.$$

This shows that non-uniqueness occurs, if there is some reduced ideal $\mathfrak{a}_2 \neq \mathfrak{a}_1$,
such that

$$\delta(\mathfrak{a}_2) \in \left]\delta(\mathfrak{A}) - f - 2^{-l+1},\; \delta(\mathfrak{A}) - f + 2^{-l+1}\right[. \qquad \Box$$

Looking back to our original argumentation, Lemma 1 shows that
non-uniqueness occurs, if there is a reduced ideal $\mathfrak{a}'$, such that the
CRIAD-representation $(\mathfrak{a}', f')$ satisfies all requirements in the definition, but $\mathfrak{a}'$ is not the
direct left or right neighbour of $\mathfrak{A}$.

To derive a bound for the uniqueness of the CRIAD-representation, it is - by
Lemma 1 - sufficient to investigate whether in a real, open interval of width
$2^{-k + \lceil \log_2(j) \rceil + 2}$ there may be two (or more) reduced ideals.

Proposition 2. Let $(\mathfrak{a}, f)$ be a (j, k)-CRIAD-representation of a principal
$\mathcal{O}_\Delta$-ideal $\mathfrak{A}$, $l = k - \lceil \log_2(j) \rceil$ and

$$\chi(\Delta) = -\log_2\left(\log\left(1/\sqrt{\Delta} + 1\right)\right) + 2. \tag{9}$$

Then the (j, k)-CRIAD-representation $(\mathfrak{a}, f)$ of $\mathfrak{A}$ is unique, if $l > \chi(\Delta)$.

Proof. Let $l = k - \lceil \log_2(j) \rceil$. Then - by Lemma 1 - it is sufficient to explore
whether two neighbouring, reduced ideals $\mathfrak{a}$, $\mathfrak{a}_+$ may lie in an open interval of
width $2^{-l+2}$. By (2) we have

$$|\delta(\mathfrak{a}, \mathfrak{a}_+)| > \log\left(1/\sqrt{\Delta} + 1\right). \tag{10}$$

We have uniqueness, if it is impossible that two neighbouring ideals lie in the
interval of width $2^{-l+2}$. By (10) we have $2^{-l+2} < \log(1/\sqrt{\Delta} + 1) < |\delta(\mathfrak{a}, \mathfrak{a}_+)|$
and therefore

$$l > -\log_2\left(\log\left(1/\sqrt{\Delta} + 1\right)\right) + 2 = \chi(\Delta). \qquad \Box$$



Remark 2. During the execution of a cryptographic protocol one needs to take
care that one remains above this minimum precision.

For $x > 1$ we have $\log(1/x + 1) > 1/(x + 1)$ and one obtains the bound

$$\chi(\Delta) < \log_2\left(\sqrt{\Delta} + 1\right) + 2. \tag{11}$$

Thus it is sufficient that - at the end of any cryptographic protocol - about
$\log_2(\Delta)/2$ correct bits of the distances are at our disposal. Note that the cursory
"analysis" in [2] suggests a minimum precision of $\log_2(\Delta)$ bits.
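To get a feeling for the size of this bound, the following sketch evaluates the uniqueness threshold of Proposition 2 and the simplified bound (11) numerically. It is a rough double-precision illustration, not part of the paper's implementation; the function names are hypothetical, and the square root is obtained from log2 of the discriminant so that large integers do not overflow.

```python
import math

def sqrt_delta(delta):
    """Floating point approximation of sqrt(delta) for a positive integer delta."""
    return 2.0 ** (math.log2(delta) / 2)

def chi(delta):
    """chi(delta) = -log2(log(1/sqrt(delta) + 1)) + 2, natural log inside."""
    return -math.log2(math.log1p(1.0 / sqrt_delta(delta))) + 2

def chi_upper_bound(delta):
    """Simplified bound (11): log2(sqrt(delta) + 1) + 2."""
    return math.log2(sqrt_delta(delta) + 1) + 2

small = 10**6
print(chi(small), chi_upper_bound(small), math.log2(small) / 2)
```

For a discriminant of about 10^6 both values are close to 12, i.e. roughly log2(Delta)/2 + 2 bits, in line with the remark above.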

5 CRIAD-Exponentiation Using Power Products 

In this section we will show what precision is sufficient in an exponentiation
routine for ideals in CRIAD-representation using power products, as in [2]. As
above, our analysis will reveal that the precision bounds given in [2] are far too
high.

We will use the LiDIA function $(\mathfrak{b}, d)$ = CLOSE($\mathfrak{a}$, t, k) which on input of
a reduced ideal $\mathfrak{a}$, a rational number t and an approximation precision k will
return a reduced ideal $\mathfrak{b}$ such that $\delta(\mathfrak{b})$ is - with respect to k - close to $\delta(\mathfrak{a}) + t$
and a k-approximation d to $\operatorname{Log}(\mathfrak{a}/\mathfrak{b})$. A detailed description of this function can
be found in [17, Section 8.4]. Note that the functionality of CLOSE is equal to
the functionality of the procedure TARGET in [2].

Let $n \in \mathbb{Z}_{>0}$ and $l = \lfloor \log_2(n) \rfloor$. Then $(n_l, \dots, n_0)$ is the binary expansion of
$n = \sum_{i=0}^{l} n_i 2^i$, where $n_i \in \{0, 1\}$ for $0 \le i \le l - 1$, $n_l = 1$.

Proof. Recall that in any call $(\mathfrak{a}, \alpha)$ = REDUCE($\mathfrak{A}$) we have $\alpha = \mathfrak{A}/\mathfrak{a}$.

Thus we obtain at the end of the for-loop $\mathfrak{a}^n = \mathfrak{h}\gamma$ and compute the floating
point number t such that $\delta(\mathfrak{h}) + t$ is close to $\delta(\mathfrak{A}^n)$. Therefore the correctness,
disregarding the approximation precision, follows from the correctness of CLOSE
[17, Section 8.4].

It remains to show the correctness of the J-value. Let 9 = 2t"/t). Then 
we have |t — Log(0)| < nj2~^ + < (2^+^ — \)j2~^ + and at 

the very end |2l" — (5(f) + ft.)| < (2*+^ — \)j2~^ + 3 • 2“^^+^^ which shows that 
J = (2*+^ — l)j + 3 • 2~^ is correct. □ 
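Algorithm 1 CRIADexpPP

Require: The (j, k)-CRIAD-representation $(\mathfrak{a}, f)$ of a principal $\mathcal{O}_\Delta$-ideal $\mathfrak{A}$, the
exponent $n = (n_l, \dots, n_0) \in \mathbb{Z}_{>0}$, the final precision k and the additional
precision z > 0.

Ensure: The (J, k)-CRIAD-representation $(\mathfrak{b}, g)$ of $\mathfrak{A}^n$, where $J = (2^{l+1} - 1)j +
3 \cdot 2^{-z}$.

$\gamma \leftarrow 1$
$\mathfrak{h} \leftarrow \mathfrak{a}$
for i = l - 1 downto 0 do
    $(\mathfrak{h}, \alpha) \leftarrow$ REDUCE($\mathfrak{h}^2$)
    $\gamma \leftarrow \gamma^2 \alpha$
    if $n_i = 1$ then
        $(\mathfrak{h}, \alpha) \leftarrow$ REDUCE($\mathfrak{h} \cdot \mathfrak{a}$)
        $\gamma \leftarrow \gamma \alpha$
    end if
end for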

f -n- qlog( 7 ,fc + z) 

(fj,A) ^ CLOSE(f),t,fc + z) 
h <— t — h 
return(f), h) 



To allow a fair comparison of the different exponentiation strategies we need
to consider their behaviour within some cryptographic protocol. We will only
give bounds for the Diffie-Hellman key-agreement protocol here and treat more
sophisticated protocols in the final paper [12].

Proposition 3. Let $a, b < 2^{l+1}$ be the secret exponents in a Diffie-Hellman
key-agreement protocol using CRIADexpPP and a reduced ideal as common base.
Then it is sufficient if both partners use $k > \chi(\Delta)$ and $z > l + 3$.

Proof. As both partners start with a reduced ideal, i.e. a (0, k)-CRIAD-representation,
they obtain a $(j_1, k)$-CRIAD-representation with the first exponentiation,
where $j_1 = 3 \cdot 2^{-z}$. The second exponentiation yields a $(j_2, k)$-representation,
where $j_2 = (2^{l+1} - 1) j_1 + 3 \cdot 2^{-z} = 3 \cdot 2^{l+1-z} < 2^{l+3-z}$. This is a k-approximation,
if $z > l + 3$. □

This proposition explains the required precision of 512 + 2 + 160 + 3 = 677
bits, for 1024 bit $\Delta$ and 160 bit exponents, stated in the introduction.
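The arithmetic behind this figure can be replayed in a few lines. The sketch below is only a bookkeeping aid under stated assumptions: the function names are hypothetical, the term for k is taken as half the bit length of the discriminant plus 2 (following bound (11)), and l = 160 is used exactly as in the calculation above.

```python
from fractions import Fraction

def dh_precision_pp(delta_bits, l):
    """Total precision k + z for Diffie-Hellman with CRIADexpPP."""
    k = delta_bits // 2 + 2   # about log2(sqrt(Delta)) + 2, cf. bound (11)
    z = l + 3                 # additional precision from Proposition 3
    return k + z

def j_after_two_exponentiations(l, z):
    """Error multiplier j_2 after both exponentiations, starting from j = 0."""
    j1 = Fraction(3, 2**z)                        # first exponentiation
    return (2**(l + 1) - 1) * j1 + Fraction(3, 2**z)

print(dh_precision_pp(1024, 160))                 # 677
print(j_after_two_exponentiations(10, 13))        # 3/4, i.e. below 1
```

With z = l + 3 the accumulated multiplier stays below 1, so the final result is still a k-approximation; with a noticeably smaller z it would exceed 1.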

6 CRIAD-Arithmetic without Power Products

Regardless of the applied exponentiation technique it is necessary to have the
procedure CRIADmult - and possibly CRIADinv - available to implement more
sophisticated - e.g. signature - protocols. Therefore we will develop this basic
arithmetic for principal ideals in CRIAD-representation. We will show in
Corollary 1 that CRIADmult is (essentially) a group operation. Thus one may
construct exponentiation techniques based on this procedure which do not require
the power product representation and consequently can be applied in
environments with limited RAM.

In our presentation of CRIADmult we will need a procedure (b,g) = RIAD2 
CRIAD((a, /), z) which uses right- or leftsteps to convert a given (j, fc)-RIAD- 
representation approximation into the (jTl, fc)-CRIAD-representation. One may^ 
use the procedure LOCAL_CLOSE [17, Section 8.2] for this purpose. 



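Algorithm 2 CRIADmult

Require: The $(j_a, k)$-CRIAD-representation $(\mathfrak{a}, a)$ of a principal ideal $\mathfrak{A}$, the
$(j_b, k)$-CRIAD-representation $(\mathfrak{b}, b)$ of a principal ideal $\mathfrak{B}$, the final precision k and
an additional precision $z \in \mathbb{Z}_{>0}$, where $k > \chi(\Delta) + \log_2(j_a + j_b + 2^{-z})$.

Ensure: The unique $(j_a + j_b + 2^{-z}, k)$-CRIAD-representation $(\mathfrak{c}, c)$ of $\mathfrak{A} \cdot \mathfrak{B}$.

$p \leftarrow k + z + 1$
$(\mathfrak{h}, h) \leftarrow$ Std2RIAD($\mathfrak{a} \cdot \mathfrak{b}$, p)
$(\mathfrak{c}, c) \leftarrow$ RIAD2CRIAD(($\mathfrak{h}$, h + a + b), p)
return $(\mathfrak{c}, c)$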



Proof. The proof will appear in the full paper [12]. □ 

Now it is easy to see that this operation is (essentially) associative, provided
that the approximation precision is chosen to be sufficiently large.

Corollary 1. Let z > 0 and $(\mathfrak{a}, a)$, $(\mathfrak{b}, b)$, $(\mathfrak{c}, c)$ be the unique $(j_a, k)$, $(j_b, k)$,
$(j_c, k)$-CRIAD-representations for the principal ideals $\mathfrak{A}$, $\mathfrak{B}$, $\mathfrak{C}$, $J = j_a + j_b + j_c +
2^{-z+1}$, where $k > \chi(\Delta) + \lceil \log_2(J) \rceil$. Let $(\mathfrak{d}_1, d_1)$ = CRIADmult(CRIADmult($(\mathfrak{a}, a)$,
$(\mathfrak{b}, b)$, z), $(\mathfrak{c}, c)$, z), where $d_1 = (m_1, e_1)$, and $(\mathfrak{d}_2, d_2)$ = CRIADmult($(\mathfrak{a}, a)$,
CRIADmult($(\mathfrak{b}, b)$, $(\mathfrak{c}, c)$, z), z), where $d_2 = (m_2, e_2)$. Then $\mathfrak{d}_1 = \mathfrak{d}_2$ and
$\mathrm{Trunc}(d_1, k + e_1) = \mathrm{Trunc}(d_2, k + e_2)$.

Proof. See [12]. □

Remark 3. Since we have chosen Lenstra's distance $\operatorname{Log}(\gamma) = \tfrac{1}{2} \log|\gamma/\bar{\gamma}|$, as
proposed in [15], instead of Shanks' naive distance $\log(\gamma)$ [21], it is easy to see
that the inversion of a principal ideal in CRIAD-representation is essentially free
of cost and especially does not impose any round-off errors. If ((a, b), f) is a
(j, k)-CRIAD-representation of $\mathfrak{A}$, then ((a, -b), -f) = CRIADinv((a, b), f) is a
(j, k)-RIAD^-representation of $\mathfrak{A}^{-1}$. This fact will be used to construct signed
digit exponentiation routines, which are slightly more efficient.

^ Note that LOCAL_CLOSE makes use of (small) power products. Due to space
restrictions we need to refer to the final paper [12] for our method RIAD2CRIAD, which
avoids power products at the cost of a slightly higher internal precision.

7 CRIAD-Exponentiation without Using Power Products 

As CRIADmult essentially behaves like a group operation it is straightforward to
construct - more sophisticated - exponentiation routines for principal ideals in
CRIAD-representation.

Due to space restrictions we will only present the exponentiation routine 
based on the classical binary square and multiply strategy in detail here. For 
the more sophisticated exponentiation routines we will only present the results 
of the precision analysis; the corresponding proofs appear in the full paper [12]. 



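Algorithm 3 CRIADexp

Require: The (j, k)-CRIAD-representation $(\mathfrak{a}, f)$ of a principal $\mathcal{O}_\Delta$-ideal $\mathfrak{A}$, the
exponent $n = (n_l, \dots, n_0) \in \mathbb{Z}_{>0}$, the final precision k and the additional
precision z > 0.

Ensure: The (J, k)-CRIAD-representation $(\mathfrak{b}, g)$ of $\mathfrak{A}^n$, where $J = (2^{l+1} - 1)j +
2^{-z}(2^{l+1} - 2)$.

$(\mathfrak{h}, h) \leftarrow (\mathfrak{a}, f)$
for i = l - 1 downto 0 do
    $(\mathfrak{h}, h) \leftarrow$ CRIADmult($(\mathfrak{h}, h)$, $(\mathfrak{h}, h)$, k, z)
    if $n_i = 1$ then
        $(\mathfrak{h}, h) \leftarrow$ CRIADmult($(\mathfrak{h}, h)$, $(\mathfrak{a}, f)$, k, z)
    end if
end for
return $(\mathfrak{h}, h)$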





Remark 4- It should be noted that the presented square and multiply strategy 
is the so called “left-to-right” variant. This is important, because it features less 
error propagation for reduced ideals than the “right-to-left” variant [14], while 
the number of group operations is the same. This is yet another point, where 
the precision stated in [2] can be easily improved. 

Proposition 4. Let $a, b < 2^{l+1}$ be the secret exponents in a Diffie-Hellman
key-agreement protocol using CRIADexp and a reduced ideal as common base. Then
it is sufficient if both partners use $k > \chi(\Delta)$ and $z > 2l + 2$.
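The way the error multiplier grows in the left-to-right square and multiply strategy can be traced with a short sketch. It tracks only the first component of the (j, k) pairs, using the rule from CRIADmult that a product of a $(j_a, k)$- and a $(j_b, k)$-representation is a $(j_a + j_b + 2^{-z}, k)$-representation; no ideal arithmetic is performed, and all function names are hypothetical.

```python
from fractions import Fraction

def criadmult_err(ja, jb, z):
    """Error multiplier of a CRIADmult result: j_a + j_b + 2**(-z)."""
    return ja + jb + Fraction(1, 2**z)

def criadexp_err(j, n, z):
    """Error multiplier after left-to-right square and multiply with exponent n."""
    bits = bin(n)[2:]              # n_l ... n_0, leading bit is 1
    acc = Fraction(j)
    for bit in bits[1:]:           # i = l - 1 downto 0
        acc = criadmult_err(acc, acc, z)        # squaring step
        if bit == "1":
            acc = criadmult_err(acc, j, z)      # multiplication by the base
    return acc

def closed_form(j, l, z):
    """J = (2**(l+1) - 1) * j + 2**(-z) * (2**(l+1) - 2)."""
    return (2**(l + 1) - 1) * Fraction(j) + Fraction(2**(l + 1) - 2, 2**z)

l, z = 4, 10                        # z = 2l + 2
worst = 2**(l + 1) - 1              # exponent with all bits set
first = criadexp_err(0, worst, z)   # first exponentiation, reduced base
second = criadexp_err(first, worst, z)
```

For the all-ones exponent the tracked multiplier coincides with the closed form J, other exponents of the same length stay below it, and after the second Diffie-Hellman exponentiation with z = 2l + 2 the multiplier is still smaller than 1.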

In a similar manner we obtain the following bounds for more sophisticated 
exponentiation techniques; for the proofs we need to refer to the final paper [12]. 

® It should be noted that, if the ideal is precisely inbetween two reduced ideals, another 
right step might be necessary to obtain the CRIAD-representation 
Proposition 5. On input of a (j, k)-CRIAD-representation and the exponent n
with $l = \lfloor \log_2(n) \rfloor$, the sliding m-bit window method (see e.g. [9]) produces a
(J, k)-CRIAD-representation, where J = 2^+^+^j + 

Let a,b < be the secret exponents in a Diffie-Hellman key-agreement 

protocol using CRIADexpwindow and a reduced ideal as common base. Then it is
sufficient if both partners use $k > \chi(\Delta)$ and $z > 2(l + m) + 3$.



Proposition 6. On input of a (j, k)-CRIAD-representation, the appropriate pre- 
computed values and the exponent n with I = [log 2 (n)J, the signed 2^^ -digit ver- 
sion of the BGMW exponentiation method [f] produces a {J^k)-CR!AD-represen- 
tation, where J = 2^+^+^j + 2^+"*+32“^. 

Let a,b < 2^~^^ be the secret exponents in a Diffie-Hellman key-agreement 
protocol using CRIADexpBGMW and a reduced ideal as common base. Then it is
sufficient if both partners use $k > \chi(\Delta)$ and $z > 2(l + m) + 6$.

If the first exponentiation is performed with CRIADexpBGMW and the second
exponentiation is performed with CRIADexpwindow, which might be often used in
practice, then it is sufficient if both partners use $k > \chi(\Delta)$ and $z > 2(l + m) + 5$.



8 Timings of a First Implementation 

In this section we provide timings of a first implementation using the different 
exponentiation methods discussed in this work. For the sake of comparison, we 
also provide timings of an implementation of the procedure EXP as given in [2]. 
The timings in Table 1 are given in seconds and correspond to randomly chosen 
discriminants and exponents of the respective bit length on a Pentium II with 
166 MHz using LiDIA 2.0 [16]. As precision we used the necessary precision 
for the Diffie-Hellman protocol as given in the Propositions 3, 4, 5, 6 and [2] 
respectively. 

The results for 500 bit discriminants and 100 bit exponents, which are closest 
to real world requirements, indicate that an exponentiation with CRIADexp is 
more than twice as fast as the exponentiation routine EXP from [2] and can - 
using precomputation - be accelerated to obtain a more than ten times faster 
exponentiation. Moreover our approach without applying power products seems 
to save not only a considerable amount of space, but also up to 30% time. 
Because the relative speedup tends to increase with higher parameters one might 
conjecture that our method is also asymptotically preferable. This issue will be 
discussed in the full paper [12]. 
Table 1. Timings for different CRIAD-exponentiations

Bitlength   Bitlength                     Exponentiation method
of          of             EXP      CRIADexpPP   CRIADexp     CRIADexpwindow  CRIADexpBGMW
Exponent    Discriminant   see [2]  see Prop. 3  see Prop. 4  see Prop. 5     see Prop. 6
-----------------------------------------------------------------------------------------
 20         100             0.50     0.38         0.33         0.44            0.11
 40         100             1.10     0.94         0.88         0.93            0.28
 60         100             2.14     1.82         1.64         1.71            0.39
 80         100             3.52     2.85         2.75         2.68            0.60
100         100             5.71     4.18         4.12         3.79            0.82
 20         200             0.99     0.77         0.60         0.66            0.17
 40         200             2.09     1.70         1.37         1.49            0.27
 60         200             3.68     2.91         2.47         2.47            0.55
 80         200             5.66     4.23         3.90         3.68            0.72
100         200             8.68     5.99         5.66         5.21            1.10
 20         300             1.76     0.87         0.83         0.99            0.27
 40         300             3.35     1.92         1.93         1.81            0.44
 60         300             5.33     3.35         3.18         2.91            0.82
 80         300             8.13     5.00         4.83         4.23            1.04
100         300            11.53     7.91         6.76         6.37            1.48
 20         400             1.76     1.48         1.10         1.32            0.32
 40         400             3.79     3.07         2.36         2.59            0.55
 60         400             6.64     4.89         3.96         4.06            0.88
 80         400            10.10     7.25         5.99         5.88            1.20
100         400            14.88    10.00         8.73         8.35            1.70
 20         500             3.35     2.58         1.38         1.43            0.43
 40         500             7.14     4.61         2.97         2.91            0.66
 60         500            10.88     7.30         5.06         4.61            1.15
 80         500            15.98    10.27         7.75         6.96            1.65
100         500            22.52    13.84        10.44         9.66            2.09

Abstract. In this paper, we first show that there are several equivalent
keys for t + 1 chosen plaintexts if the degree of the reduced cipher is
t - 1. This is against the claim by Jakobsen and Knudsen. We also derive
an upper bound on the number of equivalent last round keys for t + 1
chosen plaintexts. We further show an efficient method which finds all
the equivalent keys by using Rabin's root finding algorithm. We call our
attack the root finding interpolation attack.

Keywords: Block cipher, interpolation attack, root finding algorithm, 
resultant. 



1 Introduction 

Consider a Feistel type block cipher of block size 2n with a round function
F(K, x). For a fixed key K, F(K, x) can be viewed as a polynomial $f_K(x)$ in
x over GF($2^n$). The interpolation attack [4] succeeds if $\deg f_K(x)$ is small for
any K and the number of rounds is not large. More precisely, suppose that the
degree of the reduced cipher is t - 1, where the degree of the reduced cipher will
be defined in Definition 2.1. Then

1. Jakobsen and Knudsen claimed that the last round key $K_m$ can be recovered
from t + 1 chosen plaintexts (see [4, Theorem 3]).

2. They used exhaustive search to find $K_m$.

On the other hand, given a polynomial f(x) over GF(p), Berlekamp proposed
a probabilistic algorithm for finding a root $a \in$ GF(p) of f(x) = 0 for any odd
prime p [1]. Rabin generalized Berlekamp's algorithm to any finite field [8].
In Rabin's algorithm, the expected number of bit operations to find a root of
f(x) = 0 over GF($2^n$) is

$$O(n^2 d L(d) L(n)),$$

where $d = \deg f(x)$ and $L(n) = \log n \times \log\log n$.

In this paper, we first show that for t + 1 chosen plaintexts, there are several
equivalent keys. This is against the claim by Jakobsen and Knudsen [4, Theorem
3]. We also derive an upper bound on the number of equivalent last round keys
for t + 1 chosen plaintexts.



D.R. Stinson and S. Tavares (Eds.): SAC 2000, LNCS 2012, pp. 303—314, 2001. 
(c) Springer-Verlag Berlin Heidelberg 2001 
We next show an efficient method which finds all the equivalent last round
keys $K_m$. We call our attack the root finding interpolation attack because it uses
Rabin's root finding algorithm [8]. By using more than t + 1 chosen plaintexts,
we can uniquely determine $K_m$.

Further, Jakobsen and Knudsen claimed that the number of necessary chosen
plaintexts can be smaller than t + 1 if they use the meet in the middle approach
[4]. However, the number of equivalent keys increases if the number of chosen
plaintexts decreases in general. Therefore, their claim cannot be justified. For
this problem, we derive another upper bound on the number of equivalent last
round keys for a certain number of chosen plaintexts which is less than t + 1.

Related works: Youssef and Gong studied the effect of the choice of the irreducible
polynomial defining GF($2^n$) on $\deg f_K(x)$ and whether or not there exists a
simple linear transformation on the input or output bits such that the resulting
polynomial has a lower degree [9].

The higher differential attack succeeds if the round function F(K, x) can be
expressed as a set of low degree Boolean functions [4]. Moriai et al. showed an
improved higher differential attack for a 5 round CAST cipher in which $K_m$ is
computed by solving simultaneous linear equations [6].

2 Preliminaries 

2.1 Notation 

Consider an m round Feistel type block cipher with block size 2n. Let $x =
(x_L, x_R)$ denote the plaintext, where $x_L = (x_1, \dots, x_n)$ and $x_R = (x_{n+1}, \dots, x_{2n})$.
Similarly, let $y = (y_L, y_R)$ denote the ciphertext. Let

$$C_0^L = x_L \quad\text{and}\quad C_0^R = x_R.$$

The round function F operates as follows.

$$C_i^L = F(K_i, C_{i-1}^L) + C_{i-1}^R, \qquad C_i^R = C_{i-1}^L, \tag{1}$$

where $K_i$ denotes the i-th round key. The ciphertext $y = (y_L, y_R)$ is given by
$(C_m^L, C_m^R)$. See Fig. 1.

2.2 Reduced Cipher Assumption 

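We say that:

1. $(C_{m-1}^L, C_{m-1}^R)$ is the reduced ciphertext and
2. $(C_{m-2}^L, C_{m-2}^R)$ is the second reduced ciphertext, respectively.

Define

$$\tilde{y} = (\tilde{y}_L, \tilde{y}_R) = (C_{m-1}^L, C_{m-1}^R),$$

$$\hat{y} = (\hat{y}_L, \hat{y}_R) = (C_{m-2}^L, C_{m-2}^R).$$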
See Fig. 1. 
Fig. 1. The m round Feistel cipher



Definition 2.1. Fix the right half of a plaintext $x_R$ as $x_R = 0$. Fix the key of
the block cipher arbitrarily. Then we say that:

1. A block cipher satisfies the reduced cipher assumption of degree t - 1 if the
right half $\tilde{y}_R$ of the reduced ciphertext can be expressed as

$$\tilde{y}_R = f_1(x_L) \tag{2}$$

for some polynomial $f_1(x)$ over GF($2^n$) such that $\deg f_1(x) \le t - 1$.

2. A block cipher satisfies the second reduced cipher assumption of degree u - 1
if the right half $\hat{y}_R$ of the second reduced ciphertext can be expressed as

$$\hat{y}_R = f_2(x_L) \tag{3}$$

for some polynomial $f_2(x)$ over GF($2^n$) such that $\deg f_2(x) \le u - 1$.
2.3 Lagrange Interpolation

Let Q be a field. Given 2t elements $x_1, \dots, x_t, y_1, \dots, y_t \in Q$, where the $x_i$'s are
distinct. Define

$$f(x) = \sum_{i=1}^{t} y_i \lambda_i(x), \tag{4}$$

where

$$\lambda_i(x) = \prod_{1 \le j \le t,\; j \ne i} \frac{x - x_j}{x_i - x_j}.$$

Then f(x) is the only polynomial over Q of degree at most t - 1 such that
$f(x_i) = y_i$ for i = 1, ..., t. Eq.(4) is known as the Lagrange interpolation formula.
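As an illustration of eq.(4) over a binary field, the following sketch evaluates the interpolation polynomial over GF(2^8). The choice of the field, its reduction polynomial x^8 + x^4 + x^3 + x + 1 and all function names are assumptions made for this example only; in characteristic 2 both addition and subtraction are the bitwise XOR.

```python
N = 8
MOD = 0x11B  # x^8 + x^4 + x^3 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    """Multiply two elements of GF(2^8)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:          # degree reached N: reduce modulo MOD
            a ^= MOD
    return r

def gf_inv(a):
    """Inverse of a nonzero element, computed as a^(2^N - 2)."""
    r, e = 1, (1 << N) - 2
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def lagrange_eval(xs, ys, x):
    """Evaluate f(x) = sum_i y_i * lambda_i(x) for distinct points xs."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = gf_mul(num, x ^ xj)    # x - x_j
                den = gf_mul(den, xi ^ xj)   # x_i - x_j
        total ^= gf_mul(yi, gf_mul(num, gf_inv(den)))
    return total

def p(x):
    """Sample polynomial x^3 + 5x + 7 of degree 3 = t - 1 with t = 4."""
    return gf_mul(gf_mul(x, x), x) ^ gf_mul(5, x) ^ 7

xs = [1, 2, 3, 4]
ys = [p(x) for x in xs]
```

Since the sample polynomial has degree t - 1 = 3, its value at any further point is already fixed by the four pairs, for instance `lagrange_eval(xs, ys, 9) == p(9)`. This is the property the attack exploits for the (t + 1)-th chosen plaintext.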



3 Root Finding Algorithm over GF($2^n$)

Given a polynomial h(x) of degree d with coefficients in GF($2^n$), Rabin showed
an efficient probabilistic polynomial time algorithm which computes a root of
h(x) = 0 in GF($2^n$) if such a root exists [8].

First compute

$$h_1(x) = \gcd(h(x), x^{2^n - 1} - 1).$$

If $h_1(x) = 1$, then h(x) has no roots in GF($2^n$). In general,

$$h_1(x) = (x - a_1) \cdots (x - a_k), \quad k \le d,$$

where the $a_i$ are the pairwise different roots in GF($2^n$) of h(x) = 0.

On the other hand, the trace function is defined as

$$\mathrm{Tr}(x) = x + x^2 + x^{2^2} + \cdots + x^{2^{n-1}}.$$

For any $a \in$ GF($2^n$), it is known that

$$\mathrm{Tr}(a) = 0 \text{ or } 1.$$

Rabin first proved the following proposition.

Proposition 3.1. For any fixed $a_1 \ne a_2 \in$ GF($2^n$), choose $r \in$ GF($2^n$)
randomly. Then

$$\Pr(\mathrm{Tr}(r a_1) \ne \mathrm{Tr}(r a_2)) = 1/2.$$
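The statement of Proposition 3.1 can be checked exhaustively in a small field. The sketch below works in GF(2^8) with the reduction polynomial x^8 + x^4 + x^3 + x + 1; the field, the two sample elements and the function names are assumptions chosen for illustration only.

```python
N = 8
MOD = 0x11B  # x^8 + x^4 + x^3 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    """Multiply two elements of GF(2^8)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= MOD
    return r

def trace(a):
    """Tr(a) = a + a^2 + a^4 + ... + a^(2^(N-1))."""
    t, power = 0, a
    for _ in range(N):
        t ^= power
        power = gf_mul(power, power)   # next conjugate a^(2^i)
    return t

a1, a2 = 0x02, 0x53                    # two fixed, distinct field elements
separating = sum(
    trace(gf_mul(r, a1)) != trace(gf_mul(r, a2)) for r in range(1 << N)
)
print(separating, "of", 1 << N, "choices of r separate a1 and a2")
```

Exactly half of the 256 possible values of r give different traces, which is why a random choice of r splits the remaining roots with probability 1/2 in each round of the algorithm.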

Rabin next showed the following root finding algorithm.

Let $h_0(x) = h_1(x)$.

Step 1. If $\deg h_0(x) = 1$, then we have found a root. Otherwise goto Step 2.
Step 2. Choose $r \in$ GF($2^n$) randomly. Compute

$$h_r(x) = \gcd(h_0(x), \mathrm{Tr}(rx)).$$

Step 3. If $h_r(x) = 1$ or $h_0(x)$, goto Step 2. Otherwise, let

$$h_0(x) \leftarrow \begin{cases} h_r(x) & \text{if } \deg h_r(x) \le \tfrac{1}{2} \deg h_0(x) \\ h_0(x)/h_r(x) & \text{otherwise.} \end{cases}$$

Goto Step 1.

From Proposition 3.1, it holds that

$$\Pr[0 < \deg h_r(x) < \deg h_1(x)] \ge 1/2.$$

Therefore, it can be shown that [8] the expected number of bit operations is

$$O(n^2 d L(d) L(n)),$$

where

$$L(n) = \log n \times \log\log n.$$

4 Equivalent Keys
and Root Finding Interpolation Attack

In this section, we first show that for t + 1 chosen plaintexts, there are several
equivalent keys. This is against the claim by Jakobsen and Knudsen [4, Theorem
3]. We also derive an upper bound on the number of equivalent last round keys
for t + 1 chosen plaintexts.

We next show an efficient method which finds all the equivalent keys. We call
our attack the root finding interpolation attack because it uses Rabin's root finding
algorithm [8]. By using more than t + 1 chosen plaintexts, we can uniquely
determine $K_m$.

For a plaintext $(x_L, x_R) = (x_i, 0)$, let $(y_{L,i}, y_{R,i})$ denote the ciphertext,
$(\tilde{y}_{L,i}, \tilde{y}_{R,i})$ denote the reduced ciphertext and $(\hat{y}_{L,i}, \hat{y}_{R,i})$ denote the second
reduced ciphertext.

4.1 Key Equation

In this subsection, we derive a key equation

$$h(K_m) = 0$$

in $K_m$ such that $\deg h(K_m) \le d$, where d is given below.

Definition 4.1. We say that the round function F satisfies K polynomial
assumption of degree d if for any fixed x, there exists a polynomial $g_x$ with $\deg g_x(K)
\le d$ such that

$$F(K, x) = g_x(K).$$
Suppose that there exists a block cipher which satisfies the reduced cipher
assumption of degree t - 1. Further, without loss of generality, we can assume
that there exists d such that the block cipher satisfies K polynomial assumption
of degree d.

First by using the Lagrange formula, $f_1(x)$ of eq.(2) can be expressed as

$$f_1(x) = \lambda_1(x) f_1(x_1) + \cdots + \lambda_t(x) f_1(x_t)$$

for some polynomials $\lambda_1(x), \dots, \lambda_t(x)$, where each $\lambda_i(x)$ is determined by $x_1,
\dots, x_t$. Then we have

$$f_1(x_{t+1}) = \lambda_1(x_{t+1}) f_1(x_1) + \cdots + \lambda_t(x_{t+1}) f_1(x_t)$$

for $x = x_{t+1}$. Substituting eq.(2) into the above equation yields that

$$\tilde{y}_{R,t+1} = \lambda_1(x_{t+1}) \tilde{y}_{R,1} + \cdots + \lambda_t(x_{t+1}) \tilde{y}_{R,t}. \tag{5}$$

On the other hand, from eq.(1), it holds that

$$\tilde{y}_{R,i} = F(K_m, y_{R,i}) + y_{L,i}. \tag{6}$$

Substitute eq.(6) into eq.(5). Then we have

$$F(K_m, y_{R,t+1}) + y_{L,t+1} = \lambda_1(x_{t+1})\bigl(F(K_m, y_{R,1}) + y_{L,1}\bigr) + \cdots + \lambda_t(x_{t+1})\bigl(F(K_m, y_{R,t}) + y_{L,t}\bigr). \tag{7}$$

The above equation is rearranged as

$$h(K_m) = 0 \tag{8}$$

for some polynomial h of $K_m$ over GF($2^n$) such that

$$\deg h(K) \le d$$

from K polynomial assumption of degree d. We call eq.(8) the key equation. (This
equation is not redundant. That is, we cannot reduce $\deg h(K_m)$.)

4.2 Equivalent Keys 

Jakobsen and Knudsen claimed that the last round key $K_m$ can be recovered
from t + 1 chosen plaintexts in [4, Theorem 3]. However, eq.(8) implies that there
are at most d equivalent keys. Now we have proved the following theorem.

Theorem 4.1. Suppose that there exists a block cipher which satisfies the
reduced cipher assumption of degree t - 1 and K polynomial assumption of degree
d. Then for t + 1 chosen plaintexts, there are at most d equivalent last round keys.

In fact, the interpolation attack requires more than t + 1 chosen plaintexts
to uniquely determine $K_m$. If the round function F(K, x) is not algebraically
constructed, the situation is worse because d is usually large. In this case, there
are many equivalent keys for t + 1 chosen plaintexts and the interpolation attack
will require many chosen plaintexts to uniquely determine $K_m$.
4.3 Root Finding Interpolation Attack

We propose an attack which efficiently finds all the equivalent keys $K_m$ by
solving eq.(8) using Rabin's algorithm of Sec. 3. By using more than t + 1 chosen
plaintexts, we can uniquely determine $K_m$. We call this attack the root finding
interpolation attack.

First suppose that t + 1 chosen plaintext/ciphertext pairs are available such
that the plaintexts are $(x_1, 0), \dots, (x_{t+1}, 0)$ and the ciphertexts are
$(y_{L,1}, y_{R,1}), \dots, (y_{L,t+1}, y_{R,t+1})$. Then

Step 1. Compute the coefficients of $h(K_m)$ of eq.(8) from $x_1, \dots, x_{t+1}$ and
$(y_{L,1}, y_{R,1}), \dots, (y_{L,t+1}, y_{R,t+1})$. Especially, $\lambda_i(x)$ is determined by $x_1, \dots, x_t$
through the Lagrange interpolation formula.

Step 2. Solve eq.(8) by using Rabin's algorithm of Sec. 3. Then we obtain at most d
equivalent keys $K_m$.

Next suppose that some extra (chosen plaintext, ciphertext) pairs are
available. Then the set of equivalent keys is made smaller and we can finally uniquely
determine $K_m$. An alternative way is as follows. Obtain two key equations
$h_i(K_m) = 0$ for i = 1, 2 from t + 2 chosen plaintexts. Compute $\gcd(h_1(K_m),
h_2(K_m))$. If $K_m$ is uniquely determined from the gcd, then we are done.
Otherwise, execute the same procedure for more chosen plaintexts.
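The two steps can be tried out on a toy example. Everything in the sketch below is hypothetical and chosen only for illustration: a 3 round Feistel cipher over GF(2^8) with round function F(K, x) = (x + K)^3, the round structure left' = F(K, left) + right, right' = left, and fixed sample keys. Instead of Rabin's algorithm, the key equation is simply tested for every possible key, which is feasible only because the field is tiny.

```python
N, MOD = 8, 0x11B  # GF(2^8) with x^8 + x^4 + x^3 + x + 1 (assumed for the toy cipher)

def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= MOD
    return r

def gf_inv(a):
    r, e = 1, (1 << N) - 2
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def F(k, x):
    """Toy round function (x + K)^3, of degree 3 in x and in K."""
    s = x ^ k
    return gf_mul(gf_mul(s, s), s)

def encrypt(keys, xl, xr=0):
    left, right = xl, xr
    for k in keys:
        left, right = F(k, left) ^ right, left
    return left, right

def lagrange_coeffs(xs, x):
    """Values lambda_i(x) for the interpolation points xs."""
    coeffs = []
    for i, xi in enumerate(xs):
        num = den = 1
        for j, xj in enumerate(xs):
            if j != i:
                num = gf_mul(num, x ^ xj)
                den = gf_mul(den, xi ^ xj)
        coeffs.append(gf_mul(num, gf_inv(den)))
    return coeffs

def candidate_last_round_keys(pairs, t):
    """All keys K satisfying the key equation built from t + 1 pairs."""
    xs = [x for x, _ in pairs]
    lam = lagrange_coeffs(xs[:t], xs[t])
    found = []
    for k in range(1 << N):
        reduced = [F(k, yr) ^ yl for _, (yl, yr) in pairs]   # as in eq.(6)
        rhs = 0
        for c, v in zip(lam, reduced[:t]):
            rhs ^= gf_mul(c, v)
        if reduced[t] == rhs:                                # as in eq.(7)
            found.append(k)
    return found

keys = [0x3A, 0xC5, 0x7E]   # sample round keys
t = 4                       # reduced cipher has degree t - 1 = 3 in x_L
pairs = [(x, encrypt(keys, x)) for x in range(1, t + 2)]
candidates = candidate_last_round_keys(pairs, t)
```

The true last round key always appears in `candidates`; any further entries are equivalent keys for these t + 1 plaintexts, and adding more chosen plaintexts and intersecting the candidate sets narrows the list down.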



5 On the Meet in the Middle Approach 

Jakobsen and Knudsen also claimed that the number of necessary chosen plaintexts
can be smaller than $t+1$ if the meet in the middle approach is used [4].
However, in general the number of equivalent keys increases as the number of
chosen plaintexts decreases. Therefore, their claim cannot be justified.

In this section, we derive an upper bound on the number of equivalent last
round keys for a certain number of chosen plaintexts which is less than $t+1$.

Suppose that there exists a block cipher which satisfies the second reduced
cipher assumption of degree $u-1$ and the $K$ polynomial assumption of degree $d$.
Then from $u+2$ chosen plaintexts, we first derive two equations on $(K_{m-1}, K_m)$
such that

$$H_1(K_{m-1},K_m) = 0,$$

$$H_2(K_{m-1},K_m) = 0.$$

We next compute the resultant

$$h(K_m) = R(H_1,H_2)$$

of $H_1$ and $H_2$, which yields $\deg h(K_m) \le 2d^3$. This means
that there are $2d^3$ or fewer equivalent last round keys for $u+2$ chosen plaintexts.

Finally, we can find all the equivalent keys by solving $h(K_m) = 0$ by
Rabin's algorithm.
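5.1 Resultant [10]

Let

$$A(x) = \sum_{i=0}^{d} a_i x^i, \qquad B(x) = \sum_{i=0}^{e} b_i x^i$$

be two polynomials over a field $Q$.
Define

$$R(A,B) = \det
\begin{pmatrix}
a_d & a_{d-1} & \cdots & a_0 &        &     \\
    & \ddots  &        &     & \ddots &     \\
    &         & a_d    & a_{d-1} & \cdots & a_0 \\
b_e & b_{e-1} & \cdots & b_0 &        &     \\
    & \ddots  &        &     & \ddots &     \\
    &         & b_e    & b_{e-1} & \cdots & b_0
\end{pmatrix},$$

where the coefficients of $A$ fill the first $e$ rows and the coefficients of $B$
fill the last $d$ rows. We say that $R(A,B)$ is the resultant of $A(x)$ and $B(x)$.

Proposition 5.1. $A(x)$ and $B(x)$ have a common root in $Q$ if and only if

$$R(A,B) = 0.$$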
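
As a minimal sketch of this definition, the following builds the matrix above and evaluates its determinant. It assumes rational coefficients (Python's `Fraction`) instead of a finite field, and the function names are hypothetical; the point is only to show the matrix layout and the vanishing of the resultant when a root is shared.

```python
from fractions import Fraction

def sylvester(a, b):
    """Sylvester matrix; a and b list coefficients from highest to lowest degree."""
    d, e = len(a) - 1, len(b) - 1
    n = d + e
    rows = []
    for i in range(e):  # e shifted copies of the coefficients of A
        rows.append([0] * i + list(a) + [0] * (n - d - 1 - i))
    for i in range(d):  # d shifted copies of the coefficients of B
        rows.append([0] * i + list(b) + [0] * (n - e - 1 - i))
    return rows

def det(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return result

def resultant(a, b):
    return det(sylvester(a, b))

# (x-1)(x-2) and (x-2)(x-5) share the root 2; (x-1)(x-2) and x-3 share none.
shared = resultant([1, -3, 2], [1, -7, 10])
disjoint = resultant([1, -3, 2], [1, -3])
```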



5.2 Key Equation 

First, by using the Lagrange formula, $f_2(x)$ of eq.(3) can be expressed as

$$f_2(x) = \delta_1(x) f_2(x_1) + \cdots + \delta_u(x) f_2(x_u)$$

for some polynomials $\delta_1(x),\ldots,\delta_u(x)$, where each $\delta_i(x)$ is determined by
$x_1,\ldots,x_u$. Then we have

$$f_2(x_{u+1}) = \delta_1(x_{u+1}) f_2(x_1) + \cdots + \delta_u(x_{u+1}) f_2(x_u)$$

for $x = x_{u+1}$. Substituting eq.(3) into the above equation yields

$$d_{R,u+1} = \delta_1(x_{u+1})\, d_{R,1} + \cdots + \delta_u(x_{u+1})\, d_{R,u}. \qquad (9)$$

On the other hand, from eq.(1), it holds that

$$d_{R,i} = F(K_{m-1},\, F(K_m, y_{R,i}) + y_{L,i}) + y_{R,i}. \qquad (10)$$

Substitute eq.(10) into eq.(9). Then we have

$$H_1(K_{m-1},K_m) = 0$$

such that

$$H_1(K_{m-1},K_m) = \sum_{i=0}^{d} a_i(K_m)\, K_{m-1}^{\,i}, \qquad \deg a_i(K_m) \le d^2$$

from the $K$ polynomial assumption.

Similarly, for $x = x_{u+2}$, we have

$$H_2(K_{m-1},K_m) = 0$$

such that

$$H_2(K_{m-1},K_m) = \sum_{i=0}^{d} b_i(K_m)\, K_{m-1}^{\,i}, \qquad \deg b_i(K_m) \le d^2.$$

Finally, let

$$h(K_m) = R(H_1,H_2),$$

where $R(H_1,H_2)$ is the resultant of $H_1$ and $H_2$ regarded as polynomials in
$K_{m-1}$. From Proposition 5.1, it holds that

$$h(K_m) = 0$$

since $H_1$ and $H_2$ have a common root. Further, we can see that

$$\deg h(K_m) \le 2d^3.$$
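
The degree bound follows from a short count, sketched here under the assumption that $H_1$ and $H_2$ both have degree $d$ in $K_{m-1}$ and coefficients of degree at most $d^2$ in $K_m$ (a smaller degree in $K_{m-1}$ only shrinks the matrix).

```latex
% Sylvester matrix of H_1, H_2 with respect to K_{m-1}: size 2d x 2d,
% with d rows built from the a_i(K_m) and d rows built from the b_i(K_m).
\begin{align*}
h(K_m) = R(H_1,H_2)
  &= \sum_{\sigma} \pm\,
     \underbrace{a_{i_1}(K_m)\cdots a_{i_d}(K_m)}_{d\ \text{factors}}\;
     \underbrace{b_{j_1}(K_m)\cdots b_{j_d}(K_m)}_{d\ \text{factors}},\\
\deg h(K_m) &\le d\cdot d^2 + d\cdot d^2 = 2d^3 .
\end{align*}
```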



5.3 Equivalent Keys 

From the previous subsection, we obtain the following theorem. 

Theorem 5.1. Suppose that there exists a block cipher which satisfies the second
reduced cipher assumption of degree $u-1$ and the $K$ polynomial assumption of degree
$d$. Then for $u+2$ chosen plaintexts, there are $2d^3$ or fewer equivalent last round
keys.

5.4 Root Finding Resultant Attack 

We propose the following attack, which we call the root finding resultant attack.

First suppose that $u+2$ chosen plaintext/ciphertext pairs are available such
that the plaintexts are $(x_1,0),\ldots,(x_{u+2},0)$ and the ciphertexts are
$(y_{L,1},y_{R,1}),\ldots,(y_{L,u+2},y_{R,u+2})$. Then

Step 1. Compute the coefficients of $H_1$ and $H_2$ from $x_1,\ldots,x_{u+2}$ and
$(y_{L,1},y_{R,1}),\ldots,(y_{L,u+2},y_{R,u+2})$.

Step 2. Compute $h(K_m) = R(H_1,H_2)$.

Step 3. Solve $h(K_m) = 0$ by using Rabin's algorithm of Sec. 3. Then we obtain
$2d^3$ or fewer equivalent keys $K_m$.

Next suppose that some extra (chosen plaintext, ciphertext) pairs are available.
Then the set of equivalent keys becomes smaller and we can finally uniquely
determine $K_m$. An alternative method similar to that of Sec. 4.3 can also be used.

6 Example 

The $m$ round PURE cipher [4] is defined by letting

$$F(K,x) = (K+x)^3$$

over GF($2^{32}$).

Lemma 6.1. In eq.(4),

$$\lambda_1(x) + \cdots + \lambda_t(x) = 1.$$

Proof. $f(x) = 1$ is the only polynomial over $Q$ of degree at most $t-1$ such that
$f(x_i) = 1$ for $i = 1,\ldots,t$. Therefore, from eq.(4), we have

$$1 = \lambda_1(x) + \cdots + \lambda_t(x).$$

Q.E.D.
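
Lemma 6.1 is easy to check numerically. The sketch below assumes the prime field GF(257) for simplicity (the paper works over GF(2^32)); the function names and interpolation nodes are hypothetical.

```python
P = 257  # assumption: small prime field used only for illustration

def lagrange_basis_at(nodes, i, x):
    """Value at x of the Lagrange basis polynomial lambda_i for the given nodes."""
    num, den = 1, 1
    for j, xj in enumerate(nodes):
        if j != i:
            num = num * (x - xj) % P
            den = den * (nodes[i] - xj) % P
    return num * pow(den, -1, P) % P

def basis_sum(nodes, x):
    """lambda_1(x) + ... + lambda_t(x) over GF(P)."""
    return sum(lagrange_basis_at(nodes, i, x) for i in range(len(nodes))) % P

nodes = [3, 10, 77, 200]          # distinct interpolation points x_1..x_t
total = basis_sum(nodes, 123)     # evaluation point playing the role of x_{t+1}
```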

Corollary 6.1. The $m$ round PURE cipher has two or fewer equivalent last round
keys for $3^{m-2}+2$ chosen plaintexts.

Proof. Let $t-1 = 3^{m-2}$ and $d = 3$. Then Theorem 4.1 tells us that there are
$d = 3$ or fewer equivalent last round keys for $t+1 = 3^{m-2}+2$ chosen plaintexts.
However, in this case, eq.(7) is written as follows.

$$(K_m + y_{R,t+1})^3 + y_{L,t+1}
  = \lambda_1(x_{t+1})\big((K_m + y_{R,1})^3 + y_{L,1}\big) + \cdots
  + \lambda_t(x_{t+1})\big((K_m + y_{R,t})^3 + y_{L,t}\big).$$

By rearranging the above equation, we obtain the key equation $h(K_m) = 0$ such that

$$\deg h(K_m) = 2$$

because the coefficient of $K_m^3$ is canceled by Lemma 6.1. This implies that there
are two or fewer equivalent last round keys.

Q.E.D.
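
The cancellation can be made explicit by expanding the cube in characteristic 2. The following worked expansion is a sketch that writes $\lambda_i$ for $\lambda_i(x_{t+1})$ and uses only eq.(7) and Lemma 6.1.

```latex
% In characteristic 2, (K+a)^3 = K^3 + aK^2 + a^2K + a^3 and addition equals subtraction.
\begin{align*}
h(K_m) ={}& \Big(1 + \sum_{i=1}^{t}\lambda_i\Big) K_m^3
         + \Big(y_{R,t+1} + \sum_{i=1}^{t}\lambda_i\, y_{R,i}\Big) K_m^2
         + \Big(y_{R,t+1}^2 + \sum_{i=1}^{t}\lambda_i\, y_{R,i}^2\Big) K_m \\
        &+ \Big(y_{R,t+1}^3 + y_{L,t+1} + \sum_{i=1}^{t}\lambda_i\,\big(y_{R,i}^3 + y_{L,i}\big)\Big),
\qquad 1 + \sum_{i=1}^{t}\lambda_i = 1 + 1 = 0 .
\end{align*}
```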

The proposed attack computes all the equivalent keys $K_m$ by solving the
quadratic equation $h(K_m) = 0$ over GF($2^{32}$). By using one more chosen
plaintext, we can uniquely determine $K_m$.
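
To make the root-finding step concrete, here is a toy sketch that solves a quadratic key equation by exhaustive search. This is not Rabin's algorithm: it assumes the small field GF(2^8) with the reduction polynomial 0x11B, where trying all 256 candidates is feasible, and all names are hypothetical.

```python
def gf_mul(a, b, mod=0x11B, bits=8):
    """Multiply two elements of GF(2^8) (assumed reduction polynomial 0x11B)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> bits:
            a ^= mod
    return r

def quadratic_roots(c2, c1, c0):
    """All k in GF(2^8) with c2*k^2 + c1*k + c0 = 0, found by brute force."""
    roots = []
    for k in range(256):
        value = gf_mul(c2, gf_mul(k, k)) ^ gf_mul(c1, k) ^ c0
        if value == 0:
            roots.append(k)
    return roots

# Hypothetical key equation (K + r1)(K + r2) = K^2 + (r1 + r2)K + r1*r2.
r1, r2 = 0x53, 0xCA
candidates = quadratic_roots(1, r1 ^ r2, gf_mul(r1, r2))
```

The two candidates play the role of the equivalent keys; one further key equation would single out the correct one.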

On the contrary, Jakobsen and Knudsen claimed that the interpolation attack
needs $3^{m-2}+2$ chosen plaintexts to recover $K_m$.
Corollary 6.2. The $m$ round PURE cipher has 54 or fewer equivalent last round
keys for $3^{m-3}+3$ chosen plaintexts.

Proof. Let $u-1 = 3^{m-3}$ and $d = 3$. Then Theorem 5.1 tells us that there are
$2d^3 = 54$ or fewer equivalent last round keys for $u+2 = 3^{m-3}+3$ chosen plaintexts.

Q.E.D.

7 Summary 

In this paper, we first showed that for $t+1$ chosen plaintexts, there are several
equivalent last round keys if the degree of the reduced cipher is $t-1$. This
contradicts the claim by Jakobsen and Knudsen on the interpolation attack [4, Theorem
3]. We also derived an upper bound on the number of equivalent last round keys
for $t+1$ chosen plaintexts.

We next showed an efficient method which finds all the equivalent last round 
keys Km. We call our attack root finding interpolation attack because it uses 
Rabin’s root finding algorithm [8]. By using more than t+1 chosen plaintexts, 
we can uniquely determine Km. 

In general, the number of equivalent keys increases as the number of chosen
plaintexts decreases. For this problem, we derived another upper bound on the
number of equivalent last round keys for a certain number of chosen plaintexts
which is less than $t+1$.

As an example, we showed that the $m$ round PURE cipher has two or fewer
equivalent last round keys for $3^{m-2}+2$ chosen plaintexts and 54 or fewer equivalent
last round keys for $3^{m-3}+3$ chosen plaintexts. The proposed attack efficiently
computes all the equivalent keys $K_m$ by solving a key equation $h(K_m) = 0$
over GF($2^{32}$) by using Rabin's root finding algorithm. By using more chosen
plaintexts, we can uniquely determine $K_m$.

It will be interesting if we can extend our method to the probabilistic in- 
terpolation attack [3] which succeeds even if F{K, x) is approximated by a low 
degree polynomial. 
Abstract. This paper studies the upper bounds of the maximum differential
and linear characteristic probabilities of Feistel ciphers with SPN
round function. In the same way as for SPN ciphers, we consider the
minimum number of differential and linear active s-boxes, which provides
a measure of the upper bounds of these probabilities, in order to
evaluate the security against differential and linear cryptanalyses. The
purpose of this work is to clarify the (lower bound of) minimum numbers
of active s-boxes in some consecutive rounds of Feistel ciphers, i.e., in
three, four, six, eight, and twelve consecutive rounds, using differential
and linear branch numbers $\mathcal{P}_d$, $\mathcal{P}_l$, respectively. Furthermore, we
investigate the necessary condition for desirable P-functions, which means
that the round functions are invulnerable to both differential and linear
cryptanalyses. As an example, we show the round function of Camellia,
which satisfies the condition.



1 Introduction and Motivation 

The best known attacks are differential cryptanalysis [6] proposed by Biham and 
Shamir and linear cryptanalysis [13] proposed by Matsui. Since these cryptanal- 
yses are the most powerful approaches known for attacking many symmetric 
block ciphers, designers should evaluate the security of any new proposed ci- 
phers against differential and linear cryptanalyses. To do this it is necessary to 
determine the maximum differential and linear probabilities by a useful (and 
acceptable) method. Feistel ciphers are commonly analyzed by (a) the upper 
bounds of the maximum average of differential and linear hull probabilities or 
(b) the maximum differential and linear characteristic probabilities. SPN ciphers, 
on the other hand, are commonly analyzed by (c) the upper bounds of the max- 
imum differential and linear characteristic probabilities. Recently, Hong et al. 
showed (a) the upper bounds of the maximum average of differential and linear 
hull probabilities of SPN ciphers [9] . 

With reference to method (a), Nyberg and Knudsen showed that the maximum
average of differential and linear hull probabilities for $r$-round ($r \ge 4$)
Feistel ciphers are bounded by $2p^2$, $2q^2$ if the maximum differential and linear
probabilities of the round function are $p$, $q$, respectively^1 [18]. They stated that
Feistel ciphers are provably secure against differential and linear cryptanalyses
if these probabilities are sufficiently low. This means that they are theoretically
invulnerable to differential and linear cryptanalyses, since these probabilities
are the upper bounds of the average of differential and linear hull probabilities.
However, this approach has one fatal disadvantage. That is, these probabilities
settle at some constant value even if the number of rounds increases. Therefore,
a round function has to yield extremely low maximum differential and linear
probabilities. This imposes a hard restriction on designing the round function.
As a matter of fact, for a commercial cipher, MISTY [15] is provably secure with
respect to differential and linear cryptanalyses.

Method (b) has been used to estimate many (extended) Feistel ciphers such 
as DES [6,13] and FEAL [16,2]. Biham and Shamir claimed that the higher the 
differential characteristic probability is, the higher the success rate of differen- 
tial cryptanalysis is. This is because they exploited a single path between plain- 
texts and ciphertexts which holds significant differential characteristic probabil- 
ity. Matsui also claimed the same for linear cryptanalysis. Thus, Feistel ciphers 
are sufficiently secure against differential and linear cryptanalyses if these prob- 
abilities are less than the security threshold. Strictly speaking, however, these 
probabilities only give the lower bounds of the maximum average of differential 
and linear hull probabilities, since this method does not consider multiple paths 
between the same plaintexts and ciphertexts [12,17]. 

For SPN ciphers, Rijmen et al. introduced the branch number B [19]. The 
number B is the minimum number of active s-boxes in two consecutive rounds 
of a non-trivial differential characteristic or a non-trivial linear trail. Since each 
active s-box reduces the differential and linear characteristic probabilities, the 
number B provides the upper bounds of the maximum differential and linear 
characteristic probabilities in two consecutive rounds. The security against dif- 
ferential and linear cryptanalyses is evaluated by piling up the number B every 
two rounds. It is noted that Knudsen proposed a very similar concept for Feistel 
ciphers [10]. He noted that Feistel ciphers are practically secure against differ- 
ential and linear cryptanalyses if the upper bounds of the maximum differential 
and linear characteristic probabilities are less than the security threshold. 

It is obvious that the upper bounds of the maximum differential and linear 
characteristic probabilities by method (c) lie between the upper bounds of the 
maximum average of differential and linear hull probabilities by method (a) and 
the maximum differential and linear characteristic probabilities by method (b). 
Moreover, for most ciphers, the maximum averages of differential and linear hull 
probabilities, which provide the actual invulnerability to differential and linear 
cryptanalyses, are much lower than the upper bounds of these probabilities if 
the number of rounds increases. Therefore, it is worth investigating the upper 
bounds of the maximum differential and linear characteristic probabilities. 



^1 Aoki and Ohta showed that these probabilities are bounded by $p^2$, $q^2$ if the round
function is bijective and $r \ge 3$ [3].
Knudsen discussed the upper bounds of the maximum differential and linear
characteristic probabilities of general Feistel ciphers [10]. He showed that
the upper bounds of these probabilities for $2r$-round Feistel ciphers are $p^r$, $q^r$ if
$p$, $q$ are the maximum differential and linear probabilities of the round function,
respectively^2. His evaluation, unfortunately, did not take the interrelation
between input and output data in consecutive rounds into consideration. That is,
it is not always useful to evaluate the upper bounds of the maximum differential
and linear characteristic probabilities, if the maximum differential and linear
probabilities of the round function $p$, $q$ are relatively high while those of some
consecutive rounds are (sufficiently) low, such as DES [7].

On the other hand, in this paper, we would like to focus attention on the 
upper bounds of the maximum differential and linear characteristic probabilities 
for Feistel ciphers with SPN round function. Like SPN ciphers, Feistel ciphers 
with SPN round function only consist of s-boxes and bitwise exclusive-ORs. This 
means that the (lower bound of) minimum number of active s-boxes determines 
the upper bounds of the maximum differential and linear characteristic probabil- 
ities for not only SPN ciphers but also Feistel ciphers with SPN round function. 
This evaluation takes the interrelation between input and output data in some 
consecutive rounds into consideration, while Knudsen’s evaluation doesn’t. Ac- 
cordingly, our motivation is to clarify the (lower bound of) minimum number of 
active s-boxes in some consecutive rounds of Feistel ciphers. 

This paper is organized as follows. Section 2 introduces some notations and 
definitions. Previous works are shown in Sect. 3. In Sect. 4 and Sect. 5, the lower 
bounds of the minimum number of active s-boxes for differential and linear 
cryptanalyses are given, respectively, i.e., the upper bounds of the maximum 
differential and linear characteristic probabilities. The necessary condition for 
desirable P-functions is discussed in Sect. 6. Finally, we conclude in Sect. 7. 

2 Preliminaries 

2.1 Notations 

$X = (x_1,\ldots,x_n)$, $x_i \in \mathrm{GF}(2^m)$ $(1 \le i \le n)$ :
vector $X$ over $\mathrm{GF}(2^m)^n$ and element $x_i$ of $X$ over $\mathrm{GF}(2^m)$.

$\Delta X$, $\Gamma Y$ : difference of $X$ and mask value of $Y$, respectively.

$X \cdot \Gamma X$ : parity of the bitwise product of $X$ and $\Gamma X$.

$X \oplus Y$ : bitwise exclusive-OR (XOR).

$X \| Y$ : concatenation of $X$ and $Y$.

$\{S\}$, $\#\{S\}$ : elements in set $S$ and the number of elements in set $S$.

2.2 Model 

Throughout this paper we consider Feistel ciphers with an $mn$-bit SPN round
function (see Fig. 1). Note that we neglect the effect of the round key hereafter,
since we assume that the round key, which is used within one round, consists of
independent and uniformly random bits, and is bitwise XORed with data.
The model is described with the following notation.

Fig. 1. SPN round function

^2 Kanda et al. showed that the upper bounds of these probabilities for $3r$-round Feistel
ciphers are $p^{2r}$, $q^{2r}$ if the round function is bijective [11].

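S-function is a non-linear transformation layer with $n$ parallel $m$-bit bijective
s-boxes. That is,

$$S : \mathrm{GF}(2^m)^n \to \mathrm{GF}(2^m)^n, \qquad
X = (x_1,\ldots,x_n) \mapsto Z = S(X) = (s_1(x_1),\ldots,s_n(x_n)).$$

P-function is a linear transformation layer, i.e.,

$$P : \mathrm{GF}(2^m)^n \to \mathrm{GF}(2^m)^n, \qquad
Z = (z_1,\ldots,z_n) \mapsto Y = P(Z) = (y_1,\ldots,y_n).$$

Finally, the SPN round function can be described as follows.

$$F : \mathrm{GF}(2^m)^n \to \mathrm{GF}(2^m)^n, \qquad
X = (x_1,\ldots,x_n) \mapsto Y = F(X) = P(S(X)) = (y_1,\ldots,y_n).$$

Let $X^{(i)}$ be the input data to the $i$th round function, and $Y^{(i)}$ be the $i$th-round
output data. The Feistel cipher is defined as:

$$X^{(i+1)} = X^{(i-1)} \oplus Y^{(i)} \qquad (1 \le i \le r),$$

where $X^{(0)}$ and $X^{(1)}$ form the plaintext and $X^{(r)}$ and $X^{(r+1)}$ form the ciphertext.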
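
The model can be sketched in a few lines of code. The following toy instance is an assumption made only for illustration: $m = 4$, $n = 4$, the 4-bit s-box of PRESENT in every position, and a hypothetical binary matrix as the P-function. As in the text, round keys are neglected.

```python
from functools import reduce

# Assumed toy parameters: 4-bit cells, n = 4 cells per half block.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
P_MATRIX = [[0, 1, 1, 1],
            [1, 0, 1, 1],
            [1, 1, 0, 1],
            [1, 1, 1, 0]]  # hypothetical binary P-function

def xor_vec(a, b):
    return [x ^ y for x, y in zip(a, b)]

def s_layer(x):
    """S-function: apply the s-box to every cell."""
    return [SBOX[v] for v in x]

def p_layer(z):
    """P-function: each output cell is the XOR of the selected input cells."""
    return [reduce(lambda acc, v: acc ^ v,
                   (zj for bit, zj in zip(row, z) if bit), 0)
            for row in P_MATRIX]

def round_function(x):
    """F(X) = P(S(X))."""
    return p_layer(s_layer(x))

def encrypt(x0, x1, rounds):
    """Iterate X^(i+1) = X^(i-1) xor F(X^(i)); returns (X^(r), X^(r+1))."""
    prev, cur = x0, x1
    for _ in range(rounds):
        prev, cur = cur, xor_vec(prev, round_function(cur))
    return prev, cur

def decrypt(xr, xr1, rounds):
    """Run the recurrence backwards: X^(i-1) = X^(i+1) xor F(X^(i))."""
    cur, nxt = xr, xr1
    for _ in range(rounds):
        cur, nxt = xor_vec(nxt, round_function(cur)), cur
    return cur, nxt

plaintext = ([1, 2, 3, 4], [5, 6, 7, 8])
ciphertext = encrypt(*plaintext, 6)
recovered = decrypt(*ciphertext, 6)
```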



2.3 Definitions 

We use the following definitions in this paper. 

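Definition 1. For any given $\Delta x, \Delta z, \Gamma x, \Gamma z \in \mathrm{GF}(2^m)$, the differential and linear
probabilities of each s-box $s_i$ are defined as:

$$DP^{s_i}(\Delta x \to \Delta z) = \frac{\#\{x \in \mathrm{GF}(2^m) \mid s_i(x) \oplus s_i(x \oplus \Delta x) = \Delta z\}}{2^m},$$

$$LP^{s_i}(\Gamma z \to \Gamma x) = \left( 2 \times \frac{\#\{x \in \mathrm{GF}(2^m) \mid x \cdot \Gamma x = s_i(x) \cdot \Gamma z\}}{2^m} - 1 \right)^{2}.$$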
Definition 2. The maximum differential and linear probabilities of s-boxes are
defined as:

$$p_s = \max_{i} \max_{\Delta x \ne 0,\, \Delta z} DP^{s_i}(\Delta x \to \Delta z),$$

$$q_s = \max_{i} \max_{\Gamma x,\, \Gamma z \ne 0} LP^{s_i}(\Gamma z \to \Gamma x).$$

This means that $p_s$, $q_s$ are the upper bounds of the maximum differential and
linear probabilities for all s-boxes.
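
Definitions 1 and 2 translate directly into an exhaustive computation. The sketch below evaluates both maxima for a single s-box; the 4-bit s-box of PRESENT is used only as an assumed example, and the function names are hypothetical.

```python
from fractions import Fraction

def parity(v):
    """Parity of the bits of v, i.e. the dot product already masked."""
    return bin(v).count("1") & 1

def max_dp(sbox, m):
    """Maximum of DP(dx -> dz) over dx != 0 and all dz."""
    size = 1 << m
    best = Fraction(0)
    for dx in range(1, size):
        counts = [0] * size
        for x in range(size):
            counts[sbox[x] ^ sbox[x ^ dx]] += 1
        best = max(best, Fraction(max(counts), size))
    return best

def max_lp(sbox, m):
    """Maximum of LP(gz -> gx) over all gx and gz != 0."""
    size = 1 << m
    best = Fraction(0)
    for gz in range(1, size):
        for gx in range(size):
            agree = sum(1 for x in range(size)
                        if parity(x & gx) == parity(sbox[x] & gz))
            best = max(best, (Fraction(2 * agree, size) - 1) ** 2)
    return best

# Assumed example: the 4-bit s-box of PRESENT.
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
p_s = max_dp(PRESENT_SBOX, 4)
q_s = max_lp(PRESENT_SBOX, 4)
```

For a cipher with several distinct s-boxes, $p_s$ and $q_s$ would be the maxima of these values over all of them.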

Definition 3. A differential active s-box is defined as an s-box given a non-zero 
input difference, while a linear active s-box is defined as an s-box given a non- 
zero output mask value [11]. 

Note: When an s-box is bijective, s-boxes given a non-zero output difference 
and a non-zero input mask value are also differential and linear active s-boxes, 
respectively. 

Definition 4. Let $X = (x_1,\ldots,x_n) \in \mathrm{GF}(2^m)^n$, then the Hamming weight of
$X$ is denoted by

$$H_w(X) = \#\{i \mid x_i \ne 0\}.$$

This means that the Hamming weight of $X$ equals the number of non-zero $m$-bit
characters from $\mathrm{GF}(2^m)$ of $X$.

3 Previous Works — the Security of SPN Ciphers 

As mentioned above, the security of most SPN ciphers against differential and 
linear cryptanalyses is evaluated using the (lower bound of) minimum number of 
differential and linear active s-boxes, which are a measure of the upper bounds 
of differential and linear characteristic probabilities [19,8,5]. To determine the 
(lower bound of) minimum number of active s-boxes, Rijmen et al. defined the 
branch number B [19]. 

Definition 5. In SPN ciphers, the differential branch number $\mathcal{B}_d$ is defined as:

$$\mathcal{B}_d = \min_{\Delta X \ne 0} \big( H_w(\Delta X) + H_w(\theta(\Delta X)) \big),$$

where $\Delta X$ is an input difference into the diffusion layer and $\theta(\Delta X)$ is an output
difference from the layer.

Note that $\Delta X$ is also an output difference from a substitution layer and
$\theta(\Delta X)$ is also an input difference to the next substitution layer. Since s-boxes
are bijective, $H_w(\Delta X)$ equals the number of differential active s-boxes in the
substitution layer and $H_w(\theta(\Delta X))$ equals that in the next substitution layer.
That is, if $U_d$ is the minimum number of differential active s-boxes in two
consecutive rounds, then $U_d = \mathcal{B}_d$. Thus, it turns out that the minimum number
of differential active s-boxes in $2r$-round SPN ciphers is lower bounded by $r\mathcal{B}_d$,
and the following theorem is obtained.
Theorem 1. The maximum differential characteristic probability for $2r$-round
SPN cipher, $p_d^{(2r)}$, is upper bounded by $p_s^{\,r\mathcal{B}_d}$.

From the duality between differential characteristics and linear approxima- 
tions [4,14], the following definition and theorem also are established. 

Definition 6. The linear branch number $\mathcal{B}_l$ is defined as:

$$\mathcal{B}_l = \min_{\Gamma Y \ne 0} \big( H_w(\theta^*(\Gamma Y)) + H_w(\Gamma Y) \big),$$

where $\Gamma Y$ is an output mask value of the diffusion layer $\theta$ and $\theta^*(\Gamma Y)$ is an input
mask value of the layer. $\theta^*$ is the diffusion function of mask values concerning
the layer.

Theorem 2. The maximum linear characteristic probability for $2r$-round SPN
cipher, $p_l^{(2r)}$, is upper bounded by $q_s^{\,r\mathcal{B}_l}$.
4 Upper Bound of Differential Characteristic Probability 

In this section, we investigate the upper bound of differential characteristic prob- 
ability of Feistel cipher with SPN round function. In the same way as in the 
previous section, our goal is to clarify the (lower bound of) minimum number of 
differential active s-boxes in some consecutive rounds of Feistel cipher. 

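First, we show a useful lemma concerning the Hamming weight for Feistel
ciphers.

Lemma 1. In Feistel ciphers, the following relationship holds.

$$H_w(\Delta Y^{(i)}) = H_w(\Delta X^{(i-1)} \oplus \Delta X^{(i+1)}) \le H_w(\Delta X^{(i-1)}) + H_w(\Delta X^{(i+1)})$$

Proof.

$$\begin{aligned}
H_w(\Delta Y^{(i)}) &= H_w(\Delta X^{(i-1)} \oplus \Delta X^{(i+1)}) \\
&= \#\{j \mid \Delta x_j^{(i-1)} \ne 0 \text{ and } \Delta x_j^{(i+1)} = 0\} \\
&\quad + \#\{j \mid \Delta x_j^{(i-1)} = 0 \text{ and } \Delta x_j^{(i+1)} \ne 0\} \\
&\quad + \#\{j \mid \Delta x_j^{(i-1)} \ne 0 \text{ and } \Delta x_j^{(i+1)} \ne 0 \text{ and } \Delta x_j^{(i-1)} \ne \Delta x_j^{(i+1)}\} \\
&\le \#\{j \mid \Delta x_j^{(i-1)} \ne 0\} + \#\{j \mid \Delta x_j^{(i+1)} \ne 0\} \\
&= H_w(\Delta X^{(i-1)}) + H_w(\Delta X^{(i+1)})
\end{aligned}$$

Q.E.D.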

Since there is a linear transformation layer (P-function) in the SPN round
function, we will define the differential branch number $\mathcal{P}_d$ in the same way as
in the previous section. Note that it is obvious that if S-function is bijective
then $H_w(\Delta X) = H_w(\Delta Z)$, since $\Delta z_i$ also becomes a non-zero output difference
through the differential active $s_i$-box.

Definition 7. If S-function is bijective, the differential branch number $\mathcal{P}_d$ is
defined as follows.

$$\mathcal{P}_d = \min_{\Delta X \ne 0} \big( H_w(\Delta X) + H_w(\Delta Y) \big)$$
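
For small parameters the branch number can be found by exhaustive search. The sketch below assumes a P-function given as a binary matrix acting on $m$-bit cells and uses $H_w(\Delta X) = H_w(\Delta Z)$, so it minimises $H_w(\Delta Z) + H_w(P(\Delta Z))$; the matrices and names are hypothetical toy choices, not Camellia's.

```python
from itertools import product

def apply_matrix(matrix, vec):
    """Binary matrix times a vector of cells; addition of cells is XOR."""
    out = []
    for row in matrix:
        acc = 0
        for bit, cell in zip(row, vec):
            if bit:
                acc ^= cell
        out.append(acc)
    return out

def hamming_weight(vec):
    """Number of non-zero cells (Definition 4)."""
    return sum(1 for v in vec if v != 0)

def branch_number(matrix, m):
    """min over dz != 0 of Hw(dz) + Hw(P(dz)), by brute force."""
    n = len(matrix)
    best = None
    for dz in product(range(1 << m), repeat=n):
        if any(dz):
            w = hamming_weight(dz) + hamming_weight(apply_matrix(matrix, dz))
            best = w if best is None else min(best, w)
    return best

TOY_P = [[0, 1, 1, 1],
         [1, 0, 1, 1],
         [1, 1, 0, 1],
         [1, 1, 1, 0]]
IDENTITY = [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

toy_branch = branch_number(TOY_P, 2)
identity_branch = branch_number(IDENTITY, 2)
```

The search visits $2^{mn} - 1$ differences, so it is practical only for toy sizes.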

Here, we will define the upper bound of the maximum differential character- 
istic probability of Feistel cipher with SPN round function in the same way as 
used for the SPN cipher. That is, the upper bound of the probability is shown 
by the (lower bound of) minimum number of differential active s-boxes. 

Definition 8. Assume a Feistel cipher with SPN round function. Let $H_w(\Delta X^{(i)})$
be the number of the $i$th-round differential active s-boxes, then the differential
characteristic probability of the $r$-round Feistel cipher, $p_d^{(r)}$, satisfies the following
relationship.

$$p_d^{(r)} \le p_s^{\;\min_{(\Delta X^{(0)},\Delta X^{(1)},\ldots,\Delta X^{(r+1)}) \ne (0,0,\ldots)} \sum_{i=1}^{r} H_w(\Delta X^{(i)})}$$

From this definition, clarifying the upper bound of the maximum differential
characteristic probability becomes equivalent to showing the (lower bound of)
minimum number of differential active s-boxes. To simplify the discussion below,
the minimum number is denoted as follows.

$$\mathcal{D}^{(r)} = \min_{(\Delta X^{(0)},\Delta X^{(1)},\ldots,\Delta X^{(r+1)}) \ne (0,0,\ldots)} \sum_{i=1}^{r} H_w(\Delta X^{(i)})$$

Hereafter, because of limitations of space, we assume P-function is bijective.
Note that this leads to $\mathcal{P}_d \ge 2$.

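Lemma 2. The minimum number of differential active s-boxes in any three
consecutive rounds satisfies $\mathcal{D}^{(3)} \ge 2$.

Proof. If $\Delta X^{(i)} = 0$, then $\Delta Y^{(i)} = 0$ and $\Delta X^{(i-1)} = \Delta X^{(i+1)} \ne 0$. This leads to
$\mathcal{D}^{(3)} = 2 \times H_w(\Delta X^{(i-1)}) \ge 2$. On the other hand, if $\Delta X^{(i)} \ne 0$, it follows that
$\mathcal{D}^{(3)} \ge H_w(\Delta X^{(i)}) + H_w(\Delta Y^{(i)}) \ge \mathcal{P}_d$, since Lemma 1 shows
$H_w(\Delta X^{(i-1)}) + H_w(\Delta X^{(i+1)}) \ge H_w(\Delta Y^{(i)})$.

Q.E.D.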

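Lemma 3. The minimum number of differential active s-boxes in any four
consecutive rounds satisfies $\mathcal{D}^{(4)} \ge \mathcal{P}_d$.

Proof. Without loss of generality, we assume that the four consecutive rounds
run from the first round to the fourth round.

The input differences into two consecutive rounds are never both zero. In
addition, since the round function is bijective by assumption, the input
differences into two rounds that are two apart (rounds $i$ and $i+2$) are never both
zero. Thus we only consider the following six cases concerning input differences
into the four consecutive rounds.

(1) $\Delta X^{(1)} \ne 0$, $\Delta X^{(2)} \ne 0$, $\Delta X^{(3)} \ne 0$, $\Delta X^{(4)} \ne 0$

(2) $\Delta X^{(1)} = 0$, $\Delta X^{(2)} \ne 0$, $\Delta X^{(3)} \ne 0$, $\Delta X^{(4)} \ne 0$

(3) $\Delta X^{(1)} \ne 0$, $\Delta X^{(2)} = 0$, $\Delta X^{(3)} \ne 0$, $\Delta X^{(4)} \ne 0$

(4) $\Delta X^{(1)} \ne 0$, $\Delta X^{(2)} \ne 0$, $\Delta X^{(3)} = 0$, $\Delta X^{(4)} \ne 0$

(5) $\Delta X^{(1)} \ne 0$, $\Delta X^{(2)} \ne 0$, $\Delta X^{(3)} \ne 0$, $\Delta X^{(4)} = 0$

(6) $\Delta X^{(1)} = 0$, $\Delta X^{(2)} \ne 0$, $\Delta X^{(3)} \ne 0$, $\Delta X^{(4)} = 0$

In case (1), by Lemma 2, $\mathcal{D}^{(4)} = \mathcal{D}^{(3)} + H_w(\Delta X^{(4)}) \ge \mathcal{P}_d + H_w(\Delta X^{(4)}) \ge \mathcal{P}_d + 1$.

In case (2), $\Delta X^{(1)} = 0$ leads to $\Delta Y^{(2)} = \Delta X^{(3)}$. Thus, $\mathcal{D}^{(4)} = H_w(\Delta X^{(2)}) +
H_w(\Delta Y^{(2)}) + H_w(\Delta X^{(4)}) \ge \mathcal{P}_d + H_w(\Delta X^{(4)}) \ge \mathcal{P}_d + 1$.

Similarly, in cases (3), (4), and (5), we get $\mathcal{D}^{(4)} \ge \mathcal{P}_d + 1$.

In case (6), by Lemma 2, $\mathcal{D}^{(4)} \ge \mathcal{P}_d$.

Q.E.D.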

From the above proof, the following corollary is obtained. 

Corollary 1. The minimum number of differential active s-boxes in any four
consecutive rounds satisfies

(i) $\mathcal{D}^{(4)} \ge \mathcal{P}_d$, if and only if the input differences in both the first round and the
fourth round are zero.

(ii) $\mathcal{D}^{(4)} \ge \mathcal{P}_d + 1$ in the other cases.

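Lemma 4. The minimum number of differential active s-boxes in any six
consecutive rounds satisfies $\mathcal{D}^{(6)} \ge \mathcal{P}_d + 2$.

Proof. - If $\Delta X^{(2)} \ne 0$ and $\Delta X^{(5)} \ne 0$, by Lemma 2, $\mathcal{D}^{(6)} = \mathcal{D}^{(3)} + \mathcal{D}^{(3)} \ge
2 \times \mathcal{P}_d$.

- If $\Delta X^{(2)} = \Delta X^{(5)} = 0$, we get $\Delta X^{(1)} = \Delta X^{(3)}$ and $\Delta Y^{(3)} = \Delta X^{(4)} =
\Delta X^{(6)}$. Thus, $\mathcal{D}^{(6)} = 2 \times (H_w(\Delta X^{(3)}) + H_w(\Delta X^{(4)})) = 2 \times (H_w(\Delta X^{(3)}) +
H_w(\Delta Y^{(3)})) \ge 2 \times \mathcal{P}_d$.

- If $\Delta X^{(2)} = 0$ and $\Delta X^{(5)} \ne 0$, or $\Delta X^{(2)} \ne 0$ and $\Delta X^{(5)} = 0$, then $\mathcal{D}^{(6)} =
\mathcal{D}^{(3)} + \mathcal{D}^{(3)} \ge \mathcal{P}_d + 2$ by Lemma 2.

Q.E.D.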

Lemma 5. The minimum number of differential active s-boxes in any eight
consecutive rounds satisfies $\mathcal{D}^{(8)} \ge 2 \times \mathcal{P}_d + 1$.

Proof. Again, Corollary 1 shows that, in any four consecutive rounds, the
minimum number of differential active s-boxes satisfies (i) $\mathcal{D}^{(4)} \ge \mathcal{P}_d$, if and only if
the input differences in both the first round and the fourth round are zero, and
(ii) $\mathcal{D}^{(4)} \ge \mathcal{P}_d + 1$ in the other cases.

Since there is no case in which both input differences into any two consecutive
rounds are zero at the same time, the input differences in the fourth and
fifth rounds cannot both be zero. That is, the eight consecutive rounds cannot be
divided into two instances of case (i). Thus, $\mathcal{D}^{(8)} \ge \mathcal{P}_d + (\mathcal{P}_d + 1) = 2 \times \mathcal{P}_d + 1$.

Q.E.D.
Lemma 6. The minimum number of differential active s-boxes in any twelve
consecutive rounds satisfies $\mathcal{D}^{(12)} \ge 3 \times \mathcal{P}_d + 1$.

Proof. $\mathcal{D}^{(12)}$ can be decomposed in three ways, i.e., $4 \times \mathcal{D}^{(3)}$, $2 \times \mathcal{D}^{(6)}$, and
$\mathcal{D}^{(8)} + \mathcal{D}^{(4)}$. Since $\mathcal{D}^{(12)}$ satisfies the three evaluations at the same time, $\mathcal{D}^{(12)} \ge
\max\{4 \times \mathcal{D}^{(3)},\ 2 \times \mathcal{D}^{(6)},\ \mathcal{D}^{(8)} + \mathcal{D}^{(4)}\} \ge \mathcal{D}^{(8)} + \mathcal{D}^{(4)} \ge 3 \times \mathcal{P}_d + 1$.

Q.E.D.

From the proofs of the above-mentioned lemmas, a useful theorem for
$4r$-round Feistel ciphers is established as follows.

Theorem 3. The minimum number of differential active s-boxes for $4r$-round
Feistel ciphers with SPN round function satisfies $\mathcal{D}^{(4r)} \ge r \times \mathcal{P}_d + \lfloor r/2 \rfloor$.

Knudsen argued that for a Feistel cipher to be practically secure against
differential and linear cryptanalyses, the upper bounds of the maximum differential
and linear characteristic probabilities must be less than the security threshold.
Generally speaking, the security threshold is equated to the inverse of the
number of all plaintext blocks, i.e., $2^{-64}$ for 64-bit ciphers and $2^{-128}$ for 128-bit
ciphers.

For example, let the maximum differential probability of an 8-bit s-box be
$p_s = 2^{-6}$ and the differential branch number be $\mathcal{P}_d = 5$. It follows that 18-round
Feistel ciphers, such as Camellia [1], are practically secure against differential
cryptanalysis because of the following corollary.

Corollary 2. Assuming that the round function consists of s-boxes yielding the
maximum differential probability $p_s = 2^{-6}$ and P-function yielding the
differential branch number $\mathcal{P}_d = 5$, then a 128-bit Feistel cipher with more than
16 rounds has no effective differential characteristic.

Proof. By Definition 8 and Theorem 3 with $r = 4$, $\mathcal{D}^{(16)} \ge 4 \times 5 + 2 = 22$, so
$p_d^{(16)} \le p_s^{\mathcal{D}^{(16)}} \le (2^{-6})^{22} = 2^{-132} < 2^{-128}$.

Q.E.D.
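
The arithmetic behind Theorem 3 and Corollary 2 can be reproduced with a few lines. This is a minimal sketch; the function names are hypothetical and probabilities are handled as base-2 logarithms.

```python
def min_active_sboxes(rounds, branch_number):
    """Lower bound r*P_d + floor(r/2) of Theorem 3 for rounds = 4r."""
    if rounds % 4 != 0:
        raise ValueError("Theorem 3 is stated for 4r rounds")
    r = rounds // 4
    return r * branch_number + r // 2

def log2_characteristic_bound(rounds, branch_number, log2_ps):
    """log2 of the upper bound p_s ** D^(4r) on the characteristic probability."""
    return min_active_sboxes(rounds, branch_number) * log2_ps

# Parameters of Corollary 2: p_s = 2^-6, P_d = 5, 16 rounds, 128-bit block.
active = min_active_sboxes(16, 5)
log2_bound = log2_characteristic_bound(16, 5, -6)
below_threshold = log2_bound < -128
```

With 12 rounds the same parameters give 16 active s-boxes and a bound of $2^{-96}$, which is not below the 128-bit threshold.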



5 Upper Bound of Linear Characteristic Probability 

In this section, the upper bound of linear characteristic probability is derived 
in the same way as in the previous section. That is, our goal is to clarify the 
(lower bound of) minimum number of linear active s-boxes in some consecutive 
rounds of Feistel cipher using the duality of differential characteristic and linear 
approximation. 

First, the following theorem is established. 

Theorem 4. Consider a Feistel cipher with SPN round function. If the linear 
transformation layer P (P-function) is bijective, the cipher can be transformed 
into a Feistel cipher with the PSN round function. 
Proof. From the assumption that P-function is bijective, let $P(Z)$ denote
the transformation of $Z$ by the P-function, and $P^{-1}(Z)$ that by the inverse
function of the P-function.

As mentioned above, in a Feistel cipher with SPN round function, the equation
$X^{(i+1)} = X^{(i-1)} \oplus P(S(X^{(i)}))$ is satisfied. Now, let $V^{(i)} = P^{-1}(X^{(i)})$. The
above equation can be transformed as follows, since $C = A \oplus P(B) \Leftrightarrow C =
P(P^{-1}(A) \oplus B)$ for any $(A, B, C)$.

$$\begin{aligned}
X^{(i+1)} &= X^{(i-1)} \oplus P(S(X^{(i)})) = P\big(P^{-1}(X^{(i-1)}) \oplus S(X^{(i)})\big) \\
\Leftrightarrow\ P(V^{(i+1)}) &= P\big(V^{(i-1)} \oplus S(P(V^{(i)}))\big) \\
\Leftrightarrow\ V^{(i+1)} &= V^{(i-1)} \oplus S(P(V^{(i)}))
\end{aligned}$$

The equation $V^{(i+1)} = V^{(i-1)} \oplus S(P(V^{(i)}))$ denotes a Feistel cipher with
the PSN round function. Accordingly, the ciphertext $(X^{(r)}, X^{(r+1)})$ obtained by
applying a Feistel cipher with SPN round function to a plaintext $(X^{(0)}, X^{(1)})$
is equivalent to the result of changing the plaintext to $(V^{(0)}, V^{(1)})$
by the $P^{-1}$-function first, then getting $(V^{(r)}, V^{(r+1)})$ from $(V^{(0)}, V^{(1)})$ by the
Feistel cipher with PSN round function, and finally transforming it into the
ciphertext by the P-function.

Q.E.D.
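
Theorem 4 can be checked experimentally on a toy instance. The sketch below assumes 4-bit cells, the PRESENT s-box, and a hypothetical binary P-matrix that happens to be its own inverse over GF(2); round keys are neglected as in the model of Sect. 2.2, and all names are illustrative.

```python
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
P = [[0, 1, 1, 1],
     [1, 0, 1, 1],
     [1, 1, 0, 1],
     [1, 1, 1, 0]]
P_INV = P  # assumption checked in the tests: this matrix is an involution mod 2

def xor_vec(a, b):
    return [x ^ y for x, y in zip(a, b)]

def s_layer(x):
    return [SBOX[v] for v in x]

def apply_matrix(matrix, vec):
    out = []
    for row in matrix:
        acc = 0
        for bit, cell in zip(row, vec):
            if bit:
                acc ^= cell
        out.append(acc)
    return out

def spn_round(x):
    """SPN round function: P(S(X))."""
    return apply_matrix(P, s_layer(x))

def psn_round(v):
    """PSN round function: S(P(V))."""
    return s_layer(apply_matrix(P, v))

def feistel(x0, x1, rounds, round_fn):
    """Generic recurrence X^(i+1) = X^(i-1) xor round_fn(X^(i))."""
    prev, cur = x0, x1
    for _ in range(rounds):
        prev, cur = cur, xor_vec(prev, round_fn(cur))
    return prev, cur

def via_psn(x0, x1, rounds):
    """P^-1 on the plaintext, PSN-Feistel, then P on the result."""
    v0, v1 = apply_matrix(P_INV, x0), apply_matrix(P_INV, x1)
    vr, vr1 = feistel(v0, v1, rounds, psn_round)
    return apply_matrix(P, vr), apply_matrix(P, vr1)

x0, x1 = [1, 2, 3, 4], [5, 6, 7, 8]
direct = feistel(x0, x1, 6, spn_round)
transformed = via_psn(x0, x1, 6)
```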

Starting with the duality between differential characteristic and linear
approximation, we will define the linear branch number $\mathcal{P}_l$, which is similar to the
differential branch number $\mathcal{P}_d$. Hereafter, we assume P-function is bijective.

Definition 9. The linear branch number $\mathcal{P}_l$ is defined as:

$$\mathcal{P}_l = \min_{\Gamma Y \ne 0} \big( H_w(P^*(\Gamma Y)) + H_w(\Gamma Y) \big) = \min_{\Gamma Y \ne 0} \big( H_w(\Gamma Z) + H_w(\Gamma Y) \big),$$

where $\Gamma Y$, $\Gamma Z$ are an output mask value and an input mask value of the
P-function, respectively, and $P^*$ is a diffusion function of mask values concerning
the P-function.

Next, we will define the upper bound of the linear characteristic probability
of a Feistel cipher with SPN round function. That is, the upper bound of the
probability is shown by the (lower bound of) minimum number of linear active
s-boxes.

Definition 10. Assume a Feistel cipher with SPN round function. If $H_w(\Gamma Z^{(i)})$
is the number of the $i$th-round linear active s-boxes, then the linear characteristic
probability of the $r$-round Feistel cipher, $p_l^{(r)}$, satisfies the following relationship.

$$p_l^{(r)} \le q_s^{\;\min_{(\Gamma Y^{(0)},\ldots,\Gamma Y^{(r)},\Gamma Y^{(r+1)}) \ne (0,\ldots,0)} \sum_{i=1}^{r} H_w(\Gamma Z^{(i)})},$$

where $\Gamma Z^{(i)} = P^*(\Gamma Y^{(i)})$ and $P^*$ is the diffusion function of mask values
concerning the P-function.
From this definition, clarifying the upper bound of the linear characteristic
probability becomes equivalent to determining the (lower bound of) minimum
number of linear active s-boxes. To simplify the discussion below, we denote
the minimum number as follows.

$$\mathcal{L}^{(r)} = \min_{(\Gamma Y^{(0)},\ldots,\Gamma Y^{(r)},\Gamma Y^{(r+1)}) \ne (0,\ldots,0)} \sum_{i=1}^{r} H_w(\Gamma Z^{(i)})$$

Theorem 5. Assume a Feistel cipher with SPN round function. If both S-function
and P-function are bijective, then $\mathcal{L}^{(r)}$ and $\mathcal{P}_l$ also satisfy Lemma 2
to Lemma 6 and Theorem 3.

Proof. Because of the bijective P-function, a Feistel cipher with SPN round
function is transformed into one with PSN round function by Theorem 4. The
cipher can be described as:

$$V^{(i+1)} = V^{(i-1)} \oplus S(P(V^{(i)})) = V^{(i-1)} \oplus Z^{(i)},$$

where $V^{(i)} = P^{-1}(X^{(i)})$, $Z^{(i)} = S(X^{(i)})$.

From the duality between differential characteristic and linear approximation,
the linear approximation of the round function of the transformed cipher can be
expressed as follows using the concatenation rules [4,14].

$$\Gamma V^{(i)} = \Gamma Z^{(i-1)} \oplus \Gamma Z^{(i+1)} = P^*(\Gamma X^{(i)})$$

Since S-function is bijective, $H_w(\Gamma X) = H_w(\Gamma Z)$ because $\Gamma x_i$
is a non-zero input mask value of a linear active $s_i$-box. Therefore, the linear
branch number $\mathcal{P}_l$ is redefined as:

$$\mathcal{P}_l = \min_{\Gamma X \ne 0} \big( H_w(P^*(\Gamma X)) + H_w(\Gamma X) \big) = \min_{\Gamma Z \ne 0} \big( H_w(\Gamma V) + H_w(\Gamma Z) \big)$$

Accordingly, if $\Delta X^{(i)}$ and $\Delta Y^{(i)}$ are exchanged for $\Gamma Z^{(i)}$ and $\Gamma V^{(i)}$,
respectively, it turns out that all proofs are satisfied in the same way as for Lemma 2
to Lemma 6 and Theorem 3.

Q.E.D.



For example, the P*-function of Camellia can be expressed as: 
Thus, it is easily seen that $\mathcal{P}_l = 5$, and the following corollary is obtained.

Corollary 3. Camellia reduced to 16 rounds (without FL- and FL$^{-1}$-functions)
has no effective linear approximation.

Proof. The maximum linear probability of Camellia's s-boxes is $q_s = 2^{-6}$. From
Theorem 5 and $\mathcal{P}_l = 5$, the maximum linear characteristic probability of Camellia
reduced to 16 rounds is also upper bounded by $2^{-132}$.

Q.E.D.



6 Necessary Condition for Desirable P-Functions 



In this section, we consider the necessary condition for desirable P-functions. 
Here, “desirable” means that the round functions are invulnerable to linear 
cryptanalysis as well as differential cryptanalysis. 

Obviously, the condition is $\mathcal{P}_d = \mathcal{P}_l$ from Sect. 4 and Sect. 5. Thus, we
investigate P-functions wherein $\mathcal{P}_d = \mathcal{P}_l$.

Theorem 6. Assume that P-function is bijective and is expressed as an $n \times n$
matrix $P$ over $\mathrm{GF}(2)^m$. When the P-function satisfies $[y_i]^t = [p_{ij}][z_j]^t$, the
following relations are satisfied.

$$[\Delta y_i]^t = [p_{ij}][\Delta z_j]^t, \qquad [\Gamma z_i]^t = [p_{ij}]^t[\Gamma y_j]^t = [p_{ji}][\Gamma y_j]^t,$$

where $[x_i]$ denotes the vector (or matrix) of $X$ and $[x_i]^t$ denotes the transposed
vector (or matrix) of $X$.

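Proof. First, since $y_i = \bigoplus_{j=1}^{n} (p_{ij} \cdot z_j)$,

$$\Delta y_i = y_i \oplus y'_i = \bigoplus_{j=1}^{n} (p_{ij} \cdot z_j) \oplus \bigoplus_{j=1}^{n} (p_{ij} \cdot z'_j) = \bigoplus_{j=1}^{n} (p_{ij} \cdot z_j \oplus p_{ij} \cdot z'_j) = \bigoplus_{j=1}^{n} (p_{ij} \cdot (z_j \oplus z'_j)) = \bigoplus_{j=1}^{n} (p_{ij} \cdot \Delta z_j).$$

Thus, $[\Delta y_i]^t = [p_{ij}][\Delta z_j]^t$ is satisfied.

Second, since the P-function is bijective, $Z \cdot \Gamma Z = Y \cdot \Gamma Y$. Then,

$$Y \cdot \Gamma Y = \bigoplus_{j=1}^{n} \Big( \Big( \bigoplus_{i=1}^{n} (p_{ji} \cdot z_i) \Big) \cdot \Gamma y_j \Big) = \bigoplus_{j=1}^{n} \Big( \bigoplus_{i=1}^{n} (p_{ji} \cdot z_i \cdot \Gamma y_j) \Big) = \bigoplus_{j=1}^{n} \Big( \bigoplus_{i=1}^{n} ((p_{ji} \cdot \Gamma y_j) \cdot z_i) \Big) = \bigoplus_{i=1}^{n} \Big( \Big( \bigoplus_{j=1}^{n} (p_{ji} \cdot \Gamma y_j) \Big) \cdot z_i \Big).$$

On the other hand, since $Z \cdot \Gamma Z = \bigoplus_{i=1}^{n} (z_i \cdot \Gamma z_i)$, it is obvious that $\Gamma z_i = \bigoplus_{j=1}^{n} (p_{ji} \cdot \Gamma y_j)$. Thus, $[\Gamma z_i]^t = [p_{ji}][\Gamma y_j]^t = [p_{ij}]^t[\Gamma y_j]^t$ is satisfied.

Q.E.D.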
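The two relations of Theorem 6 can be checked numerically. The sketch below assumes that the matrix entries are 0 or 1 and act on m-bit words by XOR, which is one reading of "a matrix over GF(2)$^m$"; the example matrix and all function names are hypothetical. It tests that differences propagate through $P$ and that masks propagate backwards through the transposed matrix while preserving the parity $Z \cdot \Gamma Z = Y \cdot \Gamma Y$.

```python
import random


def parity(value: int) -> int:
    """Parity of the set bits of a non-negative integer."""
    return bin(value).count("1") & 1


def apply_matrix(matrix, vector):
    """Multiply a 0/1 matrix by a vector of m-bit words (XOR of selected words)."""
    result = []
    for row in matrix:
        acc = 0
        for bit, word in zip(row, vector):
            if bit:
                acc ^= word
        result.append(acc)
    return result


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def dot(vector, mask):
    """Parity of the bitwise product of two word vectors."""
    total = 0
    for word, mask_word in zip(vector, mask):
        total ^= parity(word & mask_word)
    return total


def check_theorem6(matrix, word_bits=8, trials=200, seed=1):
    rng = random.Random(seed)
    n = len(matrix)
    for _ in range(trials):
        z = [rng.getrandbits(word_bits) for _ in range(n)]
        z2 = [rng.getrandbits(word_bits) for _ in range(n)]
        gamma_y = [rng.getrandbits(word_bits) for _ in range(n)]
        y, y2 = apply_matrix(matrix, z), apply_matrix(matrix, z2)
        delta_z = [a ^ b for a, b in zip(z, z2)]
        delta_y = [a ^ b for a, b in zip(y, y2)]
        # Masks travel through the transposed matrix.
        gamma_z = apply_matrix(transpose(matrix), gamma_y)
        if delta_y != apply_matrix(matrix, delta_z):
            return False
        if dot(z, gamma_z) != dot(y, gamma_y):
            return False
    return True


EXAMPLE_P = [[0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
print(check_theorem6(EXAMPLE_P))
```

A randomized check of this kind is not a proof, but it is a cheap way to catch an index or transposition mistake when implementing a P-function.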



Theorem 7. Assume that $\mathcal{I}$ is the set of matrices that consist of only one 1-element and $(n-1)$ 0-elements in each line and row, i.e., the matrices generated by only interchanging lines and/or rows of the unit matrix.

If a bijective P-function can be expressed as an $n \times n$ matrix $P$ over GF(2)$^m$ such that $P^t \cdot P \in \mathcal{I}$ or $P^t = I_2 \cdot P \cdot I_1$ where $I_1, I_2 \in \mathcal{I}$, then the P-function satisfies $\mathcal{P}_d = \mathcal{P}_l$.

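Proof. By Theorem 6, if the P-function can be expressed as an $n \times n$ matrix $P$ over GF(2)$^m$, then

$$\mathcal{P}_d = \min_{\Delta Z \neq 0} (H_w(\Delta Z) + H_w(P(\Delta Z))), \qquad \mathcal{P}_l = \min_{\Gamma Y \neq 0} (H_w(P^t(\Gamma Y)) + H_w(\Gamma Y)).$$

(i) In the case of $P^t \cdot P = I^* \in \mathcal{I}$, let $\Gamma Y = P(\Gamma W)$. Since the P-function is bijective, it is guaranteed that $\{\Gamma Y\} = \{\Gamma W\}$. Thus,

$$\mathcal{P}_l = \min_{\Gamma W \neq 0} (H_w(P^t(P(\Gamma W))) + H_w(P(\Gamma W))) = \min_{\Gamma W \neq 0} (H_w(I^* \cdot \Gamma W) + H_w(P(\Gamma W))).$$

Here, because $I^* \in \mathcal{I}$, $I^* \cdot \Gamma W$ leads to another vector simply by interchanging the elements of $\Gamma W$. Thus, $H_w(\Gamma W) = H_w(I^* \cdot \Gamma W)$. As a result, $\exists (\Delta Z, \Gamma W)$, s.t. $\mathcal{P}_d = \mathcal{P}_l$.

(ii) In the case of $P^t = I_2 \cdot P \cdot I_1$ where $I_1, I_2 \in \mathcal{I}$, as mentioned above, since $I_1$ and $I_2$ lead to another vector simply by interchanging the elements, $H_w(\Delta X) = H_w(I_1(\Delta X))$ and $H_w(I_2(\Gamma W)) = H_w(\Gamma W)$. Now, let $\Delta Z = I_1(\Delta X)$. Since $I_1$ is bijective, it is guaranteed that $\{\Delta X\} = \{\Delta Z\}$. Thus,

$$\mathcal{P}_d = \min_{\Delta X \neq 0} (H_w(I_1(\Delta X)) + H_w(P \cdot I_1(\Delta X))) = \min_{\Delta X \neq 0} (H_w(\Delta X) + H_w(P \cdot I_1(\Delta X))).$$

On the other hand,

$$\mathcal{P}_l = \min_{\Gamma Y \neq 0} (H_w(I_2 \cdot P \cdot I_1(\Gamma Y)) + H_w(\Gamma Y)) = \min_{\Gamma Y \neq 0} (H_w(P \cdot I_1(\Gamma Y)) + H_w(\Gamma Y)).$$

As a result, $\exists (\Delta X, \Gamma Y)$, s.t. $\mathcal{P}_d = \mathcal{P}_l$.

Q.E.D.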
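For small parameters the statement of Theorem 7 can be checked by exhaustive search. The sketch below is only an illustration: it assumes 0/1 matrix entries acting on 2-bit words, and it uses a hypothetical $4 \times 4$ example matrix (all ones except a zero diagonal), which is symmetric and satisfies $P^t \cdot P = I$ over GF(2), so case (i) applies. The differential branch number is computed from $P$ and the linear one from the transposed matrix, following Theorem 6.

```python
from itertools import product


def apply_matrix(matrix, vector):
    """Multiply a 0/1 matrix by a vector of m-bit words (XOR of selected words)."""
    result = []
    for row in matrix:
        acc = 0
        for bit, word in zip(row, vector):
            if bit:
                acc ^= word
        result.append(acc)
    return result


def hamming_weight(vector):
    """Number of non-zero words in the vector."""
    return sum(1 for word in vector if word != 0)


def branch_number(matrix, word_bits):
    """Minimum of H_w(V) + H_w(matrix * V) over all non-zero word vectors V."""
    n = len(matrix)
    best = None
    for vector in product(range(1 << word_bits), repeat=n):
        if not any(vector):
            continue
        weight = hamming_weight(vector) + hamming_weight(apply_matrix(matrix, vector))
        if best is None or weight < best:
            best = weight
    return best


def differential_and_linear_branch_numbers(matrix, word_bits=2):
    transposed = [list(column) for column in zip(*matrix)]
    return branch_number(matrix, word_bits), branch_number(transposed, word_bits)


EXAMPLE_P = [[0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
print(differential_and_linear_branch_numbers(EXAMPLE_P))
```

The search visits $2^{mn}$ vectors, so it is practical only for toy sizes; for a real cipher such as Camellia ($n = 8$, $m = 8$) a coding-theoretic argument is needed instead.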

For example, the relationship between the P-function and the $P^*$-function of Camellia is shown as follows.

$$P^*_{\mathrm{Camellia}} = P^t_{\mathrm{Camellia}} = I_2 \cdot P_{\mathrm{Camellia}} \cdot I_1$$

Thus Theorem 7 indicates that the P-function of Camellia is "desirable."

7 Conclusion

This paper studied the upper bounds of the maximum differential and linear 
characteristic probabilities of Feistel ciphers with SPN round function. In the 
same way as for SPN ciphers, we considered the minimum number of differ- 
ential and linear active s-boxes, which are a measure of the upper bounds of 
these probabilities, in order to evaluate security against differential and linear 
cryptanalyses. The advantage of this method is that it considers the interre- 
lation between input and output data in consecutive rounds, unlike Knudsen’s 
estimation. 

We focused on the minimum number of active s-boxes in some consecutive
rounds of Feistel ciphers, i.e., in three, four, six, eight, and twelve consecutive
rounds, since they can determine the upper bounds of the maximum differential
and linear probabilities using the differential and linear branch numbers $\mathcal{P}_d$, $\mathcal{P}_l$,
respectively. These numbers provide the avalanche effects of P-functions with
regard to differential and linear characteristics. As a result, we clarified that
the lower bounds of the minimum number of differential (resp. linear) active
s-boxes are 2, $\mathcal{P}_d$ ($\mathcal{P}_l$), $\mathcal{P}_d + 2$ ($\mathcal{P}_l + 2$), $2\mathcal{P}_d + 1$ ($2\mathcal{P}_l + 1$), and $3\mathcal{P}_d + 1$ ($3\mathcal{P}_l + 1$),
respectively. The interesting result is that the lower bound of the minimum
number of active s-boxes is proportional to the branch number every fourth
round, while it seems to be every third round at first glance. Furthermore, this
means that, if the branch number is the same, a 2r-round Feistel cipher has
almost the same invulnerability to differential and linear cryptanalyses as an r-round
SPN cipher in terms of the upper bounds of the maximum differential and linear
probabilities.
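The comparison between Feistel and SPN ciphers in the paragraph above can be made concrete with a small lookup. This is a sketch under stated assumptions: it encodes only the five round counts listed in the text, the SPN bound $r \cdot B$ for $2r$ rounds is the one from Sect. 3, and the function names are illustrative.

```python
# Lower bounds on the number of active s-boxes for the listed round counts,
# as functions of the branch number b (taken from the summary above).
FEISTEL_LOWER_BOUNDS = {
    3: lambda b: 2,
    4: lambda b: b,
    6: lambda b: b + 2,
    8: lambda b: 2 * b + 1,
    12: lambda b: 3 * b + 1,
}


def feistel_active_sboxes(rounds: int, branch_number: int) -> int:
    """Lower bound for a Feistel cipher with SPN round function."""
    if rounds not in FEISTEL_LOWER_BOUNDS:
        raise ValueError("only 3, 4, 6, 8 and 12 consecutive rounds are tabulated")
    return FEISTEL_LOWER_BOUNDS[rounds](branch_number)


def spn_active_sboxes(rounds: int, branch_number: int) -> int:
    """Lower bound r * B for a 2r-round SPN cipher."""
    if rounds % 2:
        raise ValueError("the SPN bound is stated for an even number of rounds")
    return (rounds // 2) * branch_number


for feistel_rounds in (4, 8, 12):
    spn_rounds = feistel_rounds // 2
    if spn_rounds % 2 == 0:
        print(feistel_rounds,
              feistel_active_sboxes(feistel_rounds, 5),
              spn_active_sboxes(spn_rounds, 5))
```

With branch number 5, eight Feistel rounds give at least 11 active s-boxes, against 10 for four SPN rounds, which illustrates the "almost the same" claim.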

Finally, we investigated the necessary condition for desirable P-functions, 
which means that the round functions are invulnerable to both differential and 
linear cryptanalyses. In addition, we showed the example of the round function 
of Camellia, which satisfies the condition.

Abstract. The block cipher GOST was proposed in the former Soviet Union
in 1989. In this paper we present the first result of differential cryptanalysis
of GOST with a reduced number of rounds. By introducing the idea
of using a set of differential characteristics, which is a partitioning type,
we can reduce the influence of the key value upon the probability as well
as get a high differential probability. Using $2^{51}$ chosen plaintexts the key of
13-round GOST can be obtained. Next this differential cryptanalysis is
extended by combining it with a related-key attack. Using $2^{56}$ chosen plaintexts
the key of 21 rounds of GOST can be obtained.



1 Introduction 

The block cipher GOST was proposed in the former Soviet Union in 1989 [1]. GOST
is an acronym for "Gosudarstvennyi Standard", or Government Standard.

In this paper we present the first result of differential cryptanalysis of GOST
with a reduced number of rounds. Next the analysis is extended by combining it
with a related-key attack.

GOST has key addition modulo $2^{32}$ in each round function, so orthodox
differential cryptanalysis using one characteristic is not useful. The reason is
that the probability of a differential characteristic varies not only with the value of
the input-output difference but also with the value of the sub-key, and frequently becomes zero. To
overcome this we introduce the idea of using a set of differential characteristics.
This is similar to the truncated differential attack [2, 3, 4] in predicting only parts of
the output bit values. But it is slightly different in the sense that this attack uses
a set of differentials of S-boxes and applies it to the round function, which is a
partitioning type, and constructs a new type of 2-round iterative characteristic.

With this characteristic we can reduce the influence of the key value upon the
probability as well as get a high differential probability. On average, using $2^{51}$
chosen plaintexts the key of 13-round GOST can be obtained. In the case of keys
which make the probability the highest, 17 rounds of GOST can be attacked.

This differential cryptanalysis is extended by combining it with a related-key attack [5].
John Kelsey et al. applied a related-key attack to GOST [6], but no concrete
characteristics were revealed in [6]. In this paper we show the concrete characteristics.
On average, using $2^{56}$ chosen plaintexts the key of 21-round GOST can be obtained.

These attacks are also applicable even if the S-boxes are randomly generated.

Fig. 1. GOST round function

This paper is organized as follows. Section 2 briefly reviews algorithm of 
GOST. In Section 3 we describe a differential cryptanalysis of GOST using one 
differential characteristic. In Section 4 we describe a differential cryptanalysis 
of GOST using a set of differential characteristics. In Section 5 we discuss the 
differential cryptanalysis with combining related-key attack. In Section 6 we 
discuss the differential cryptanalysis in the case of random S-boxes. We conclude 
in Section 7. 



2 Description of GOST 

The block cipher GOST is based on the framework of the Feistel cipher. GOST
has 32 rounds, a 64-bit block size, and a 256-bit key size. The F-function consists of
the operations specified as follows (see also Figure 1).

+ : addition modulo $2^{32}$
S-boxes : 8 different 4 x 4-bit S-boxes $S_1, S_2, \ldots, S_8$
<<< 11 : 11-bit left rotation

The S-boxes are not specified in the standard. In this paper we use the set of
S-boxes used in an application for the Central Bank of the Russian Federation
(tables of the S-boxes are given on page 333 of [7]; see also Appendix A).

The key schedule is simple. The 256-bit master key is divided into eight 32-bit
blocks: $k_1, k_2, \ldots, k_8$. Each round uses the subkey shown in the table below.

Round   1   2   3   4   5   6   7   8
Key     k1  k2  k3  k4  k5  k6  k7  k8

Round   9   10  11  12  13  14  15  16
Key     k1  k2  k3  k4  k5  k6  k7  k8

Round   17  18  19  20  21  22  23  24
Key     k1  k2  k3  k4  k5  k6  k7  k8

Round   25  26  27  28  29  30  31  32
Key     k8  k7  k6  k5  k4  k3  k2  k1

Fig. 2. One of the best 3-round iterative characteristics
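The F-function and key schedule described in Section 2 can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes that $S_1$ acts on the least significant nibble of the 32-bit word and $S_8$ on the most significant one, and the names `round_function` and `subkey_index` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value: int, shift: int) -> int:
    """Rotate a 32-bit word to the left by `shift` bits."""
    return ((value << shift) | (value >> (32 - shift))) & MASK32


def round_function(half: int, subkey: int, sboxes) -> int:
    """Add the subkey modulo 2**32, apply eight 4-bit S-boxes, rotate left by 11.

    Assumption: sboxes[0] (S_1) acts on the least significant nibble.
    """
    total = (half + subkey) & MASK32
    substituted = 0
    for index in range(8):
        nibble = (total >> (4 * index)) & 0xF
        substituted |= sboxes[index][nibble] << (4 * index)
    return rotl32(substituted, 11)


def subkey_index(round_number: int) -> int:
    """Return j such that round `round_number` (1..32) uses k_j."""
    if not 1 <= round_number <= 32:
        raise ValueError("GOST has rounds 1 to 32")
    if round_number <= 24:
        return (round_number - 1) % 8 + 1   # k1..k8 repeated three times
    return 33 - round_number                # k8 down to k1 in rounds 25..32


IDENTITY_SBOXES = [list(range(16)) for _ in range(8)]  # placeholder S-boxes
print([subkey_index(r) for r in range(1, 33)])
print(hex(round_function(0x00000001, 0x00000000, IDENTITY_SBOXES)))
```

The identity S-boxes are placeholders used only to exercise the code path; the actual tables are the ones listed in Appendix A.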

3 An Attack Using One Differential Characteristic 

GOST has key addition modulo $2^{32}$ in each round. In such a cipher the differential
probability varies not only with the value of the input-output difference but
also with the value of the sub-key, and frequently becomes zero (see also Appendix B).
Figure 2 shows one of the best 3-round iterative characteristics in the case of the S-boxes
used in an application for the Central Bank of the Russian Federation [7]^1. This
characteristic has the probability described below in one round.

0 < Pro6{ 0500 1008,, ^ 05001008„} < 1.5 x 2”'^ (1) 

Where X — >■ Y means that an inputxor X result in an outputxor Y. Average 
probability over all key values is 1.3 x in one round. 8-round characteristic 
has probability 2“®^. So using 2-Round attack 10-round GOST is expected to 
be attacked using 2®® chosen plaintexts. But more than half of sub- key space 
makes the differential probability of each S-box to be zero(See also Appendix 
B). In 8-round characteristic the chance for the probability to be nonzero is only 
2 X 10“®. Consequently attack using one differential characteristic is not useful 
for GOST. 




4 Cryptanalysis of GOST Using a Set 
of Differential Characteristics 

To overcome the dependence of the differential probability on the key, we introduce
the idea of using a set of differential characteristics. This is slightly different
from truncated differentials in the sense that this attack uses a set of differentials
of S-boxes and applies it to the round function, which is a partitioning type, and
constructs a new type of 2-round iterative characteristic. With these characteristics
we can considerably reduce the influence of the key value as well as get a higher
differential probability than described in Section 3.

^1 A 3-round iterative characteristic which has 2 active S-boxes in each round is impossible.

[Figure 3 panels: round 1, p = 1; round 2, p = 0.37; round 8, p = 0.43 x 0.37 x 0.37 x 0.47; round (n-2), p = 0.38 x 0.37 x 0.35 x 0.45; round (n-1), p = 1; outputs $C_L$, $C_R$]

Fig. 3. A set of differential characteristics



4.1 A Set of Differential Characteristics 

We use a set of differential characteristics as shown in Figure 3. The differences
of plaintext pairs are $(00000\eta00_x \,\|\, 00000000_x)$, where $\eta$ means a nonzero 4-bit difference
whose MSB (most significant bit) is zero.

This set of differential characteristics is possible when the LSB (least significant
bit) of the output difference of each active S-box is zero. The number of active
S-boxes increases one by one with the number of rounds, and saturates at 4
after round 8.

First we estimate the probability of the differentials of each S-box, that is,
Prob{nonzero difference whose MSB is 0 $\to$ nonzero difference whose LSB is 0}.
The probability varies from 0.30 to 0.75 depending on the S-box number and the
key values (see Appendix C for details). Let $p_{S_i}$ be the average probability of $S_i$ over
all key values. The average probability of each round is the product of the $p_{S_i}$ of the
active S-boxes. For example, the average probability of rounds 8, 10, and so on is
$p_{S_1} \times p_{S_3} \times p_{S_5} \times p_{S_7} = 0.43 \times 0.37 \times 0.37 \times 0.47$.

4.2 An Attack on GOST

To recover the last round sub-key, we use the characteristics shown in Figure 3.
First we have to fix all 32 bits of the input difference to the $(n-1)$-th F-function to
zero. To realize this, a cancellation has to happen at the XOR operation after the
$(n-2)$-th F-function. If each value of $\eta$ is randomly distributed from 1 to 7, the
probability of getting zero difference is $(1/7)^4$. We estimate the probability $p_n$ of
these characteristics for n-round GOST. The probability is shown as follows when
n is even^2.



?-l ^■-2 ^■-2 f-3 f-3 

Pn=p|3 xpl^ xpl^ 

4 

f-4 ^-4 f-5 ■ ' ' 

XPk X pi X pi X 



The S/N ratio is defined as follows [8].

$$S/N = \frac{2^k \times p}{\alpha \times \beta}$$

k = number of key bits we are looking for
p = probability of the characteristics
$\alpha$ = average count of keys per analyzed pair
$\beta$ = ratio of analyzed pairs to all pairs

In this 2-Round attack each value is described as follows^3.

$$k = 32, \quad \alpha = 1, \quad \beta = 2^{-20} \qquad (2)$$

Consequently we get

$$S/N = p_n \times 2^{52} \qquad (3)$$

Table 1 shows the estimated values of p, S/N, and the number of chosen plaintexts
needed. If we choose a structure of $2^3$ plaintexts which differ only in 3 bits
of $P_L$ ($\eta$ in Figure 3), one structure yields 28 pairs of plaintexts. For example
2^® plaintexts propose about 2^^ pairs. In the case of the keys which make the 
probability of differential characteristics the highest, pn equals 1.6 x 2“^® and 
17-round GOST can be attacked. 
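The signal-to-noise computation and the pair count per structure can be reproduced with a few lines. This is a sketch under the parameters stated in this section ($k = 32$, $\alpha = 1$, and 20 ciphertext bits fixed to zero, so $\beta = 2^{-20}$); the function names are illustrative.

```python
from math import comb


def signal_to_noise(p: float, key_bits: int = 32, alpha: float = 1.0,
                    beta: float = 2.0 ** -20) -> float:
    """S/N = 2**k * p / (alpha * beta) for a characteristic of probability p."""
    return (2.0 ** key_bits) * p / (alpha * beta)


def pairs_per_structure(free_bits: int = 3) -> int:
    """Number of unordered plaintext pairs in a structure of 2**free_bits texts."""
    return comb(2 ** free_bits, 2)


print(pairs_per_structure())               # pairs in one structure of 8 plaintexts
print(signal_to_noise(1.75 * 2.0 ** -50))  # a probability near the 13-round entry
```

With these parameters the ratio reduces to $p \times 2^{52}$, so a characteristic of probability about $1.75 \times 2^{-50}$ corresponds to the S/N value of 7 listed for 13 rounds.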

5 A Related-Key Attack 

A related-key attack was first described in [5]. John Kelsey et al. proposed a
related-key attack on GOST [6], but no concrete characteristics were revealed. In

^2 When n is odd the probability is shown in a similar way.
^3 20 bits of $C_R$ are fixed to zero, so $\beta = 2^{-20}$. All bits of $C_L \oplus F_n(C_R)$ are fixed to
zero, so $\alpha = 1$.
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Table 1. Estimates of pairs needed for differential attack 



Rounds 


Prob. 


S/N 


Chosen Plaintexts 


12 


1.2 X 2-'^^ 


2« 


24® 


13 


1.7 X 2"®° 


7 


2®i 


14 


1.5 X 2"®® 


0.4 


impossible 



Table 2. Estimates of pairs needed for related-key attack 



Rounds 


Prob. 


S/N 


Chosen Plaintexts 


20 


1.8 X 2"^^ 


1.8 X 2® 


249 


21 


1.3 X 2-®^ 


1.3 


2®6 


22 


1.1 X 2"®^ 


2"® 


impossible 



this section the differential cryptanalysis described in Section 4.2 is extended
by combining it with a related-key attack, and the concrete characteristics are shown.
Two unknown related keys $K$ and $K^*$ are used for the attack. The relationship
between the two keys is described as follows.

$$K = (k_1, k_2, \ldots, k_8)$$

$$K^* = (k_1 \oplus 80000000_x, k_2, \ldots, k_8)$$

Using plaintext $P = (P_L, P_R)$ for key $K$ and $P^* = (P_L \oplus 00000700_x, P_R)$ for
key $K^*$, we can bypass the first 8 rounds for free with probability $\frac{1}{4}$^4. Figure 4
shows the differential characteristics using the related-key attack.

In round 9 the output difference is $00000\eta00_x$ with probability $\frac{3}{4}$. After round
10 the differential characteristics are the same as described in Section 4.2. Consequently
the probability of the differential characteristics of n-round GOST is
$\frac{1}{4} \times \frac{3}{4} \times p_{n-8}$, where $p_{n-8}$ is calculated from equation (2).

Table 2 shows the estimated values of p, S/N, and the number of chosen
plaintexts needed for the attack.
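The two S-box probabilities used in the related-key characteristic can be recomputed from the $S_8$ table in Appendix A. The sketch below rests on an assumed bit layout that is not spelled out in the text: $S_8$ acts on the most significant nibble, and the 11-bit rotation moves its output to bit positions 7 to 10, so that the word difference $00000700_x$ corresponds to the S-box output difference $e_x$, and $00000\eta00_x$ corresponds to any nonzero output difference whose LSB is zero.

```python
# S_8 as listed in Appendix A.
S8 = [1, 15, 13, 0, 5, 7, 10, 4, 9, 2, 3, 14, 6, 11, 8, 12]


def output_differences(sbox, input_difference):
    """Output XOR difference for every input x under a fixed input difference."""
    return [sbox[x] ^ sbox[x ^ input_difference] for x in range(16)]


diffs = output_differences(S8, 0x8)

# Probability that input difference 8 yields output difference e (hex).
prob_bypass = diffs.count(0xE) / 16

# Probability that input difference 8 yields a nonzero difference with LSB 0.
prob_round9 = sum(1 for d in diffs if d != 0 and d & 1 == 0) / 16

print(prob_bypass, prob_round9, prob_bypass * prob_round9)
```

Because the input difference sits in the most significant bit of the word, it passes through the modular key addition unchanged, which is why these counts do not depend on the key.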

6 In the Case of Random S-Boxes 

The S-boxes are not specified in GOST. In this section we discuss the attack
in the case of random S-boxes. We generated 100,000 random S-boxes
and obtained Prob{nonzero difference whose MSB is 0 $\to$ nonzero difference
whose LSB is 0} for each of them. Table 3 shows the probability, the corresponding ratio of the
number of S-boxes, and the number of rounds we can attack. On average 12 rounds
of GOST can be attacked with a set of differential characteristics.

Next we consider the case of the analysis combined with a related-key attack.
The best characteristic in round 1 to bypass the first 8 rounds for free varies from

^4 $(k_1 \oplus 80000000_x) + P_R = k_1 + (P_R \oplus 80000000_x)$. So this probability is equal to
Prob{$8_x \to e_x$}. For all values of $k_1$ this probability of $S_8$ is $\frac{1}{4}$.

[Figure 4 labels: $P_L$, $P_R$; $P_L^*$ $(= P_L \oplus 00000700_x)$, $P_R$]

Fig. 4. The differential characteristics with related key
Table 3. Estimates in the case of random S-boxes

Prob                   0.34<  0.38<  0.42<  0.47<  0.50<  0.54<   0.625
Ratio (%)              62     27     8      2.3    0.5    0.4
Rounds (differential)  12     13     14     15     16     17~20   20
Rounds (related-key)   19     20     21     22     23     24~26   27



the S-box construction. But we can always find the characteristic which has the 
probability larger than So we use this probability for every S-boxes here. On 
average 19 rounds of GOST can be attacked with combining related-key attack. 

The maximum probability of all random S-boxes^5 is 0.625. In this case 20
rounds of GOST can be attacked using a set of differential characteristics, and
27 rounds of GOST can be attacked by combining it with a related-key attack.

Consequently this set of differential characteristics is useful even if the S-boxes
are randomly generated.

7 Conclusion 

In this paper we described the first result of an attack on GOST with a reduced
number of rounds using a set of differential characteristics, which is a partitioning
type. In the case of the S-boxes used in an application for the Central Bank of the
Russian Federation, on average the key of 13-round GOST can be obtained using
$2^{51}$ chosen plaintexts. In the case of the keys which make the probability of the
differential characteristics the highest, 17-round GOST can be attacked. The
analysis is also extended by combining it with a related-key attack. Using $2^{56}$ chosen
plaintexts the key of 21-round GOST can be obtained.

We also showed that this attack is applicable even if the S-boxes are randomly
generated. On average 12 rounds of GOST can be attacked with a set of differential
characteristics, and 19 rounds of GOST can be attacked by combining it with a
related-key attack. In the case of the weakest S-box, 20 rounds of GOST can be attacked
with a set of differential characteristics, and 27 rounds of GOST can be attacked
by combining it with a related-key attack.

^5 The S-box which has the maximum probability is
{9,7,5,1,11,15,3,13,0,4,12,10,14,8,2,6}.



Appendix A: S-Boxes 

The set of S-boxes used in an application for the Central Bank of the Russian
Federation is given on page 333 of [7].

$S_8$ = {1,15,13,0,5,7,10,4,9,2,3,14,6,11,8,12}

$S_7$ = {13,11,4,1,3,15,5,9,0,10,14,7,6,8,2,12}

$S_6$ = {4,11,10,0,7,2,1,13,3,6,8,5,9,12,15,14}

$S_5$ = {6,12,7,1,5,15,13,8,4,10,9,14,0,3,11,2}

$S_4$ = {7,13,10,1,0,8,9,15,14,4,6,12,11,2,5,3}

$S_3$ = {5,8,1,13,10,3,4,2,14,15,12,7,6,0,9,11}

$S_2$ = {14,11,4,12,6,13,15,10,2,3,8,1,0,7,5,9}

$S_1$ = {4,10,9,2,13,8,0,14,6,11,1,12,7,15,5,3}
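The S-boxes above can be used to recompute the per-S-box probability Prob{nonzero difference whose MSB is 0 $\to$ nonzero difference whose LSB is 0} that the attack relies on. The sketch below uses a simplified model that is an assumption on my part: the 4-bit key nibble is added modulo 16, and carries into and out of the nibble are ignored. Under that assumption it reproduces several tabulated entries of Appendix C, for instance 0.75 for $S_7$ with key nibble 0, as well as the average of 0.625 quoted for the weakest random S-box.

```python
S8 = [1, 15, 13, 0, 5, 7, 10, 4, 9, 2, 3, 14, 6, 11, 8, 12]
S7 = [13, 11, 4, 1, 3, 15, 5, 9, 0, 10, 14, 7, 6, 8, 2, 12]
S1 = [4, 10, 9, 2, 13, 8, 0, 14, 6, 11, 1, 12, 7, 15, 5, 3]
# S-box quoted in Section 6 as the one with the maximum probability.
WEAKEST = [9, 7, 5, 1, 11, 15, 3, 13, 0, 4, 12, 10, 14, 8, 2, 6]


def msb0_to_lsb0_probability(sbox, key_nibble):
    """Fraction of (x, delta) with delta in 1..7 whose output difference has LSB 0.

    Simplified model: the key nibble is added modulo 16 and carries across
    nibble boundaries are ignored.
    """
    hits = 0
    for delta in range(1, 8):          # nonzero input differences with MSB 0
        for x in range(16):
            u = (x + key_nibble) & 0xF
            v = ((x ^ delta) + key_nibble) & 0xF
            if (sbox[u] ^ sbox[v]) & 1 == 0:
                hits += 1
    return hits / (7 * 16)


def average_probability(sbox):
    """Average of the probability over all 16 key nibbles."""
    return sum(msb0_to_lsb0_probability(sbox, k) for k in range(16)) / 16


print(round(msb0_to_lsb0_probability(S7, 0), 2))
print(round(msb0_to_lsb0_probability(S8, 0), 2))
print(round(average_probability(WEAKEST), 3))
```

Since the S-boxes are bijective, the output difference is automatically nonzero whenever the two inputs differ, so only the LSB condition needs to be tested.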

Appendix B: An Example of Dependence 
of Differential Probability on the Key 

The table below shows Prob{$8_x \to 2_x$} of $S_1$, Prob{$1_x \to a_x$} of $S_4$, and
Prob{$5_x \to 1_x$} of $S_7$ used in an application for the Central Bank of the Russian
Federation. The key values in the table are the 4 bits of the round key whose bit
positions correspond to each S-box. We do not count the case in which a
differential carry bit occurs to the fourth position, because the carry bit does not
hold the input difference of the upper S-box to be zero. This table shows that the
probability becomes zero for more than half of the key space.

key   0    1    2    3    4    5    6    7    8    9    a    b    c    d    e    f
S7    0    .13  0    .25  0    .13  0    0    0    .13  0    .25  0    .13  0    0
S4    .38  0    .38  0    .38  0    .38  0    .38  0    .38  0    .38  0    .38  0
S1    .13  0    0    0    0    0    0    0    0    .13  .13  .13  .13  .13  .13  .13

Appendix C: Differential Distribution Table of Each S-Box

This table shows Prob{nonzero difference whose MSB is 0 $\to$ nonzero difference
whose LSB is 0} of each S-box used in an application for the Central Bank of
the Russian Federation. We do not count the case in which a differential carry bit
to the fourth position occurs, because the carry bit does not hold the input difference
of the upper S-box to be zero.

key   0    1    2    3    4    5    6    7    8    9    a    b    c    d    e    f    average
S8    .46  .46  .43  .43  .43  .46  .46  .46  .46  .46  .43  .43  .43  .46  .46  .46  .45
S7    .75  .55  .43  .36  .39  .38  .43  .55  .75  .55  .43  .36  .39  .38  .43  .55  .47
S6    .43  .39  .32  .30  .32  .32  .36  .39  .43  .39  .32  .30  .32  .32  .36  .39  .35
S5    .46  .39  .36  .32  .32  .32  .36  .43  .46  .39  .36  .32  .32  .32  .36  .43  .37
S4    .46  .38  .36  .32  .39  .32  .36  .39  .46  .38  .36  .32  .39  .32  .36  .39  .37
S3    .43  .39  .32  .32  .32  .32  .43  .39  .43  .39  .32  .32  .32  .32  .43  .39  .37
S2    .46  .43  .36  .36  .32  .38  .36  .38  .46  .43  .36  .36  .32  .38  .36  .38  .38
S1    .57  .55  .43  .36  .39  .36  .36  .43  .57  .55  .43  .36  .39  .36  .36  .43  .43

Abstract. This paper studies the upper bounds of the maximum differential
and linear characteristic probabilities of Feistel ciphers with SPN
round function. In the same way as for SPN ciphers, we consider the
minimum number of differential and linear active s-boxes, which provides
a measure of the upper bounds of these probabilities, in order to
evaluate the security against differential and linear cryptanalyses. The
purpose of this work is to clarify the (lower bound of) minimum numbers
of active s-boxes in some consecutive rounds of Feistel ciphers, i.e., in
three, four, six, eight, and twelve consecutive rounds, using differential
and linear branch numbers $\mathcal{P}_d$, $\mathcal{P}_l$, respectively. Furthermore, we investigate
the necessary condition for desirable P-functions, which means
that the round functions are invulnerable to both differential and linear
cryptanalyses. As an example, we show the round function of Camellia,
which satisfies the condition.



1 Introduction and Motivation 

The best known attacks are differential cryptanalysis [6] proposed by Biham and 
Shamir and linear cryptanalysis [13] proposed by Matsui. Since these cryptanal- 
yses are the most powerful approaches known for attacking many symmetric 
block ciphers, designers should evaluate the security of any new proposed ci- 
phers against differential and linear cryptanalyses. To do this it is necessary to 
determine the maximum differential and linear probabilities by a useful (and 
acceptable) method. Feistel ciphers are commonly analyzed by (a) the upper 
bounds of the maximum average of differential and linear hull probabilities or 
(b) the maximum differential and linear characteristic probabilities. SPN ciphers, 
on the other hand, are commonly analyzed by (c) the upper bounds of the max- 
imum differential and linear characteristic probabilities. Recently, Hong et al. 
showed (a) the upper bounds of the maximum average of differential and linear 
hull probabilities of SPN ciphers [9] . 

With reference to method (a), Nyberg and Knudsen showed that the maximum
average of differential and linear hull probabilities for r-round ($r \geq 4$)
Feistel ciphers are bounded by $2p^2$, $2q^2$ if the maximum differential and linear
probabilities of the round function are $p$, $q$, respectively^1 [18]. They stated that
Feistel ciphers are provably secure against differential and linear cryptanalyses
if these probabilities are sufficiently low. This means that they are theoretically
invulnerable to differential and linear cryptanalyses, since these probabilities
are the upper bounds of the average of differential and linear hull probabilities.
However, this approach has one fatal disadvantage. That is, these probabilities
settle at some constant value even if the number of rounds increases. Therefore,
a round function has to yield extremely low maximum differential and linear
probabilities. This imposes a hard restriction on designing the round function.
As a matter of fact, for a commercial cipher, MISTY [15] is provably secure with
respect to differential and linear cryptanalyses.

Method (b) has been used to estimate many (extended) Feistel ciphers such
as DES [6,13] and FEAL [16,2]. Biham and Shamir claimed that the higher the
differential characteristic probability is, the higher the success rate of differential
cryptanalysis is. This is because they exploited a single path between plaintexts
and ciphertexts which holds a significant differential characteristic probability.
Matsui also claimed the same for linear cryptanalysis. Thus, Feistel ciphers
are sufficiently secure against differential and linear cryptanalyses if these
probabilities are less than the security threshold. Strictly speaking, however, these
probabilities only give the lower bounds of the maximum average of differential
and linear hull probabilities, since this method does not consider multiple paths
between the same plaintexts and ciphertexts [12,17].

For SPN ciphers, Rijmen et al. introduced the branch number B [19]. The 
number B is the minimum number of active s-boxes in two consecutive rounds 
of a non-trivial differential characteristic or a non-trivial linear trail. Since each 
active s-box reduces the differential and linear characteristic probabilities, the 
number B provides the upper bounds of the maximum differential and linear 
characteristic probabilities in two consecutive rounds. The security against dif- 
ferential and linear cryptanalyses is evaluated by piling up the number B every 
two rounds. It is noted that Knudsen proposed a very similar concept for Feistel 
ciphers [10]. He noted that Feistel ciphers are practically secure against differ- 
ential and linear cryptanalyses if the upper bounds of the maximum differential 
and linear characteristic probabilities are less than the security threshold. 

It is obvious that the upper bounds of the maximum differential and linear 
characteristic probabilities by method (c) lie between the upper bounds of the 
maximum average of differential and linear hull probabilities by method (a) and 
the maximum differential and linear characteristic probabilities by method (b). 
Moreover, for most ciphers, the maximum averages of differential and linear hull 
probabilities, which provide the actual invulnerability to differential and linear 
cryptanalyses, are much lower than the upper bounds of these probabilities if 
the number of rounds increases. Therefore, it is worth investigating the upper 
bounds of the maximum differential and linear characteristic probabilities. 



^1 Aoki and Ohta showed that these probabilities are bounded by $p^2$, $q^2$ if the round
function is bijective and $r \geq 3$ [3].

Knudsen discussed the upper bounds of the maximum differential and linear
characteristic probabilities of general Feistel ciphers [10]. He showed that
the upper bounds of these probabilities for 2r-round Feistel ciphers are $p^r$, $q^r$ if
$p$, $q$ are the maximum differential and linear probabilities of the round function,
respectively^2. His evaluation, unfortunately, did not take the interrelation between
input and output data in consecutive rounds into consideration. That is,
it is not always useful to evaluate the upper bounds of the maximum differential
and linear characteristic probabilities, if the maximum differential and linear
probabilities of the round function $p$, $q$ are relatively high while those of some
consecutive rounds are (sufficiently) low, such as DES [7].

On the other hand, in this paper, we would like to focus attention on the 
upper bounds of the maximum differential and linear characteristic probabilities 
for Feistel ciphers with SPN round function. Like SPN ciphers, Feistel ciphers 
with SPN round function only consist of s-boxes and bitwise exclusive-ORs. This 
means that the (lower bound of) minimum number of active s-boxes determines 
the upper bounds of the maximum differential and linear characteristic probabil- 
ities for not only SPN ciphers but also Feistel ciphers with SPN round function. 
This evaluation takes the interrelation between input and output data in some 
consecutive rounds into consideration, while Knudsen’s evaluation doesn’t. Ac- 
cordingly, our motivation is to clarify the (lower bound of) minimum number of 
active s-boxes in some consecutive rounds of Feistel ciphers. 

This paper is organized as follows. Section 2 introduces some notations and 
definitions. Previous works are shown in Sect. 3. In Sect. 4 and Sect. 5, the lower 
bounds of the minimum number of active s-boxes for differential and linear 
cryptanalyses are given, respectively, i.e., the upper bounds of the maximum 
differential and linear characteristic probabilities. The necessary condition for 
desirable P-functions is discussed in Sect. 6. Finally, we conclude in Sect. 7. 

2 Preliminaries 

2.1 Notations 

$X = (x_1, \ldots, x_n)$, $x_i \in \mathbb{Z}_2^m$ $(1 \leq i \leq n)$ :
vector $X$ over GF$(2^m)^n$ and element $x_i$ of $X$ over GF$(2^m)$.

$\Delta X$, $\Gamma Y$ : difference of $X$ and mask value of $Y$, respectively.

$X \cdot \Gamma X$ : parity of the bitwise product of $X$ and $\Gamma X$.

$X \oplus Y$ : bitwise exclusive-OR (XOR).

$X \| Y$ : concatenation of $X$ and $Y$.

$\{S\}$, $\#\{S\}$ : elements in set $S$ and the number of elements in set $S$.

2.2 Model 

Throughout this paper we consider Feistel ciphers with an mn-bit SPN round
function (see Fig. 1). Note that we neglect the effect of the round key hereafter,
since we assume that the round key, which is used within one round, consists of
independent and uniformly random bits, and is bitwise XORed with data.

^2 Kanda et al. showed that the upper bounds of these probabilities for 3r-round Feistel
ciphers are $p^{2r}$, $q^{2r}$ if the round function is bijective [11].

Fig. 1. SPN round function

The model is described with the above notations as follows.

The S-function is a non-linear transformation layer with n parallel m-bit bijective
s-boxes. That is,

$$S : (\mathbb{Z}_2^m)^n \to (\mathbb{Z}_2^m)^n, \qquad X = (x_1, \ldots, x_n) \mapsto Z = S(X) = (s_1(x_1), \ldots, s_n(x_n)).$$

The P-function is a linear transformation layer, i.e.,

$$P : (\mathbb{Z}_2^m)^n \to (\mathbb{Z}_2^m)^n, \qquad Z = (z_1, \ldots, z_n) \mapsto Y = P(Z) = (y_1, \ldots, y_n).$$

Finally, the SPN round function can be described as follows.

$$F : (\mathbb{Z}_2^m)^n \to (\mathbb{Z}_2^m)^n, \qquad X = (x_1, \ldots, x_n) \mapsto Y = F(X) = P(S(X)) = (y_1, \ldots, y_n).$$

Let be the input data to the Tth round function, and be the i-round 
output data. The Feistel cipher is defined as: 

^ ^ y(i) (1 < j < 

where is a plaintext and is a ciphertext. 



2.3 Definitions 

We use the following definitions in this paper. 

Definition 1. For any given $\Delta x, \Delta z, \Gamma x, \Gamma z \in \mathbb{Z}_2^m$, the differential and linear
probabilities of each s-box $s_i$ are defined as:

$$DP^{s_i}(\Delta x \to \Delta z) = \frac{\#\{x \in \mathbb{Z}_2^m \mid s_i(x) \oplus s_i(x \oplus \Delta x) = \Delta z\}}{2^m}$$

$$LP^{s_i}(\Gamma z \to \Gamma x) = \left( 2 \times \frac{\#\{x \in \mathbb{Z}_2^m \mid x \cdot \Gamma x = s_i(x) \cdot \Gamma z\}}{2^m} - 1 \right)^2$$

Definition 2. The maximum differential and linear probabilities of s-boxes are
defined as:

$$p_s = \max_i \max_{\Delta x \neq 0, \Delta z} DP^{s_i}(\Delta x \to \Delta z)$$

$$q_s = \max_i \max_{\Gamma x, \Gamma z \neq 0} LP^{s_i}(\Gamma z \to \Gamma x)$$

This means that $p_s$, $q_s$ are the upper bounds of the maximum differential and
linear probabilities for all s-boxes.
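Definitions 1 and 2 translate directly into an exhaustive count over all inputs of an s-box. The sketch below is illustrative only: the 4-bit permutation `TOY_SBOX` is an arbitrary example chosen for this sketch and is not an s-box from the paper, and the function names are hypothetical.

```python
def parity(value: int) -> int:
    """Parity of the set bits of a non-negative integer."""
    return bin(value).count("1") & 1


def differential_probability(sbox, dx: int, dz: int) -> float:
    """DP(dx -> dz): fraction of inputs x with s(x) xor s(x xor dx) == dz."""
    size = len(sbox)
    count = sum(1 for x in range(size) if sbox[x] ^ sbox[x ^ dx] == dz)
    return count / size


def linear_probability(sbox, gz: int, gx: int) -> float:
    """LP(gz -> gx): squared correlation between x.gx and s(x).gz."""
    size = len(sbox)
    count = sum(1 for x in range(size)
                if parity(x & gx) == parity(sbox[x] & gz))
    return (2 * count / size - 1) ** 2


def max_probabilities(sboxes):
    """Return (p_s, q_s) over a list of s-boxes of equal size."""
    size = len(sboxes[0])
    p_s = max(differential_probability(s, dx, dz)
              for s in sboxes for dx in range(1, size) for dz in range(size))
    q_s = max(linear_probability(s, gz, gx)
              for s in sboxes for gz in range(1, size) for gx in range(size))
    return p_s, q_s


TOY_SBOX = [3, 8, 15, 1, 10, 6, 5, 11, 14, 13, 4, 2, 7, 0, 9, 12]
print(max_probabilities([TOY_SBOX]))
```

The search costs on the order of $2^{3m}$ operations per s-box, which is negligible for 4-bit and still practical for 8-bit s-boxes.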

Definition 3. A differential active s-box is defined as an s-box given a non-zero 
input difference, while a linear active s-box is defined as an s-box given a non- 
zero output mask value [11]. 

Note: When an s-box is bijective, s-boxes given a non-zero output difference 
and a non-zero input mask value are also differential and linear active s-boxes, 
respectively. 

Definition 4. Let $X = (x_1, \ldots, x_n) \in GF(2^m)^n$, then the Hamming weight of 
$X$ is denoted by 

\[ H_w(X) = \#\{i \mid x_i \ne 0\}. \]

This means that the Hamming weight of $X$ equals the number of non-zero $m$-bit 
characters from $GF(2^m)$ of $X$. 
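
Note that this weight counts non-zero words, not non-zero bits. A short sketch, with a vector represented as a tuple of integers (a representation chosen here for illustration):

```python
def hamming_weight(x):
    """Number of non-zero m-bit characters in X = (x_1, ..., x_n)."""
    return sum(1 for xi in x if xi != 0)


# Two of the four bytes are non-zero, regardless of how many bits are set.
print(hamming_weight((0x00, 0x5A, 0x00, 0xFF)))
```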

3 Previous Works — the Security of SPN Ciphers 

As mentioned above, the security of most SPN ciphers against differential and 
linear cryptanalyses is evaluated using the (lower bound of) minimum number of 
differential and linear active s-boxes, which are a measure of the upper bounds 
of differential and linear characteristic probabilities [19,8,5]. To determine the 
(lower bound of) minimum number of active s-boxes, Rijmen et al. defined the 
branch number B [19]. 

Definition 5. In SPN ciphers, the differential branch number $B_d$ is defined as: 

\[ B_d = \min_{\Delta X \ne 0} \left( H_w(\Delta X) + H_w(\theta(\Delta X)) \right), \]

where $\Delta X$ is an input difference into the diffusion layer and $\theta(\Delta X)$ is an output 
difference from the layer. 
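
For a small diffusion layer the branch number can be found by exhaustive search. The sketch below assumes a hypothetical 4 x 4 binary matrix acting on 2-bit words (each output word is the XOR of the other three input words); the matrix and the names `apply_matrix` and `branch_number` are illustrative only.

```python
from itertools import product

# Toy diffusion layer (assumption): y_i is the XOR of all z_j with j != i.
THETA = [[0, 1, 1, 1],
         [1, 0, 1, 1],
         [1, 1, 0, 1],
         [1, 1, 1, 0]]


def apply_matrix(matrix, z):
    """Multiply a binary matrix by a vector of m-bit words (XOR of selected words)."""
    out = []
    for row in matrix:
        acc = 0
        for bit, zj in zip(row, z):
            if bit:
                acc ^= zj
        out.append(acc)
    return tuple(out)


def hamming_weight(x):
    return sum(1 for xi in x if xi != 0)


def branch_number(matrix, m):
    """min over non-zero dX of H_w(dX) + H_w(theta(dX)), by exhaustive search."""
    n = len(matrix)
    return min(hamming_weight(dx) + hamming_weight(apply_matrix(matrix, dx))
               for dx in product(range(1 << m), repeat=n) if any(dx))


print(branch_number(THETA, m=2))
```

Exhaustive search is only feasible for toy sizes; for 8-bit words and eight s-boxes the input space has $2^{64}$ elements.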

Note that $\Delta X$ is also an output difference from a substitution layer and 
$\theta(\Delta X)$ is also an input difference to the next substitution layer. Since s-boxes 
are bijective, $H_w(\Delta X)$ equals the number of differential active s-boxes in the 
substitution layer and $H_w(\theta(\Delta X))$ equals that in the next substitution layer. 
That is, the minimum number of differential active s-boxes in two consecutive 
rounds equals $B_d$. Thus, the minimum number of differential active s-boxes in 
$2r$-round SPN ciphers is lower bounded by $r B_d$, and the following theorem is obtained. 

Theorem 1. The maximum differential characteristic probability for $2r$-round 
SPN cipher, $p_d^{(2r)}$, is upper bounded by $p_s^{r B_d}$. 

From the duality between differential characteristics and linear approximations 
[4,14], the following definition and theorem are also established. 

Definition 6. The linear branch number $B_l$ is defined as: 

\[ B_l = \min_{\Gamma Y \ne 0} \left( H_w(\theta^{*}(\Gamma Y)) + H_w(\Gamma Y) \right), \]

where $\Gamma Y$ is an output mask value of the diffusion layer $\theta$ and $\theta^{*}(\Gamma Y)$ is an input 
mask value of the layer. $\theta^{*}$ is the diffusion function of mask values concerning 
the layer. 

Theorem 2. The maximum linear characteristic probability for $2r$-round SPN 
cipher, $p_l^{(2r)}$, is upper bounded by $q_s^{r B_l}$. 

4 Upper Bound of Differential Characteristic Probability 

In this section, we investigate the upper bound of differential characteristic prob- 
ability of Feistel cipher with SPN round function. In the same way as in the 
previous section, our goal is to clarify the (lower bound of) minimum number of 
differential active s-boxes in some consecutive rounds of Feistel cipher. 

First, we show a useful lemma concerning the Hamming weight for Feistel 
ciphers. 

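Lemma 1. In Feistel ciphers, the following relationship holds. 

\[ H_w(\Delta Y^{(i)}) = H_w(\Delta X^{(i-1)} \oplus \Delta X^{(i+1)}) \le H_w(\Delta X^{(i-1)}) + H_w(\Delta X^{(i+1)}) \]

Proof. 

\begin{align*}
H_w(\Delta Y^{(i)}) &= H_w(\Delta X^{(i-1)} \oplus \Delta X^{(i+1)}) \\
&= \#\{s \mid \Delta x_s^{(i-1)} \ne 0 \text{ and } \Delta x_s^{(i+1)} = 0\} \\
&\quad + \#\{t \mid \Delta x_t^{(i-1)} = 0 \text{ and } \Delta x_t^{(i+1)} \ne 0\} \\
&\quad + \#\{u \mid \Delta x_u^{(i-1)} \ne 0 \text{ and } \Delta x_u^{(i+1)} \ne 0 \text{ and } \Delta x_u^{(i-1)} \ne \Delta x_u^{(i+1)}\} \\
&\le H_w(\Delta X^{(i-1)}) + H_w(\Delta X^{(i+1)})
\end{align*}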

Q.E.D. 
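
The inequality of Lemma 1 is a property of the word-wise Hamming weight alone, so it can be checked exhaustively for small parameters. The following sketch assumes vectors of $n = 3$ words of $m = 2$ bits; the function names are illustrative.

```python
from itertools import product


def hamming_weight(x):
    return sum(1 for xi in x if xi != 0)


def xor_words(a, b):
    return tuple(ai ^ bi for ai, bi in zip(a, b))


def lemma1_holds(n, m):
    """Check H_w(A ^ B) <= H_w(A) + H_w(B) for all pairs of n-word vectors."""
    vectors = list(product(range(1 << m), repeat=n))
    return all(
        hamming_weight(xor_words(a, b)) <= hamming_weight(a) + hamming_weight(b)
        for a in vectors for b in vectors)


print(lemma1_holds(n=3, m=2))
```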

Since there is a linear transformation layer (P-function) in the SPN round 
function, we will define the differential branch number $\mathcal{P}_d$ in the same way as 
in the previous section. Note that if S-function is bijective 
then $H_w(\Delta X) = H_w(\Delta Z)$, since $\Delta z_i$ is also a non-zero output difference 
of each differential active s-box $s_i$. 

Definition 7. If S-function is bijective, the differential branch number $\mathcal{P}_d$ is 
defined as follows. 

\[ \mathcal{P}_d = \min_{\Delta X \ne 0} \left( H_w(\Delta X) + H_w(\Delta Y) \right) \]

Here, we will define the upper bound of the maximum differential character- 
istic probability of Feistel cipher with SPN round function in the same way as 
used for the SPN cipher. That is, the upper bound of the probability is shown 
by the (lower bound of) minimum number of differential active s-boxes. 

Definition 8. Assume Feistel cipher with SPN round function. Let $H_w(\Delta X^{(i)})$ 
be the number of the $i$th-round differential active s-boxes, then the differential 
characteristic probability of the $r$-round Feistel cipher, $p_d^{(r)}$, satisfies the following 
relationship. 

\[ p_d^{(r)} \le p_s^{\,\min_{(\Delta X^{(0)}, \Delta X^{(1)}, \ldots, \Delta X^{(r+1)}) \ne (0, 0, \ldots)} \sum_{i=1}^{r} H_w(\Delta X^{(i)})} \]

From this definition, clarifying the upper bound of the maximum differential 
characteristic probability becomes equivalent to showing the (lower bound of) 
minimum number of differential active s-boxes. To simplify the discussion of 
this minimum number, it is denoted as follows. 

\[ \mathcal{D}^{(r)} = \min_{(\Delta X^{(0)}, \Delta X^{(1)}, \ldots, \Delta X^{(r+1)}) \ne (0, 0, \ldots)} \sum_{i=1}^{r} H_w(\Delta X^{(i)}) \]

Hereafter, because of limitations of space, we assume P-function is bijective. 
Note that this leads to $\mathcal{P}_d \ge 2$. 

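Lemma 2. The minimum number of differential active s-boxes in any three 
consecutive rounds satisfies $\mathcal{D}^{(3)} \ge 2$. 

Proof. If $\Delta X^{(i)} = 0$, then $\Delta Y^{(i)} = 0$ and $\Delta X^{(i-1)} = \Delta X^{(i+1)} \ne 0$. This leads to 
$\mathcal{D}^{(3)} = 2 \times H_w(\Delta X^{(i-1)}) \ge 2$. On the other hand, if $\Delta X^{(i)} \ne 0$, it follows that 
$\mathcal{D}^{(3)} \ge H_w(\Delta X^{(i)}) + H_w(\Delta Y^{(i)}) \ge \mathcal{P}_d$, since Lemma 1 shows 
$H_w(\Delta X^{(i-1)}) + H_w(\Delta X^{(i+1)}) \ge H_w(\Delta Y^{(i)})$. 

Q.E.D. 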

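Lemma 3. The minimum number of differential active s-boxes in any four 
consecutive rounds satisfies $\mathcal{D}^{(4)} \ge \mathcal{P}_d$. 

Proof. Without loss of generality, we assume that the four consecutive rounds 
run from the first round to the fourth round. 

The input differences into two consecutive rounds are never both zero. In 
addition, by the bijectivity assumption, the input differences into rounds $i$ and 
$i+2$ are never both zero, since $\Delta Y^{(i+1)} = \Delta X^{(i)} \oplus \Delta X^{(i+2)} = 0$ would force 
$\Delta X^{(i+1)} = 0$ as well. Thus we only consider the following six cases 
concerning input differences into the four consecutive rounds. 

(1) $\Delta X^{(1)} \ne 0$, $\Delta X^{(2)} \ne 0$, $\Delta X^{(3)} \ne 0$, $\Delta X^{(4)} \ne 0$ 

(2) $\Delta X^{(1)} = 0$, $\Delta X^{(2)} \ne 0$, $\Delta X^{(3)} \ne 0$, $\Delta X^{(4)} \ne 0$ 

(3) $\Delta X^{(1)} \ne 0$, $\Delta X^{(2)} = 0$, $\Delta X^{(3)} \ne 0$, $\Delta X^{(4)} \ne 0$ 

(4) $\Delta X^{(1)} \ne 0$, $\Delta X^{(2)} \ne 0$, $\Delta X^{(3)} = 0$, $\Delta X^{(4)} \ne 0$ 

(5) $\Delta X^{(1)} \ne 0$, $\Delta X^{(2)} \ne 0$, $\Delta X^{(3)} \ne 0$, $\Delta X^{(4)} = 0$ 

(6) $\Delta X^{(1)} = 0$, $\Delta X^{(2)} \ne 0$, $\Delta X^{(3)} \ne 0$, $\Delta X^{(4)} = 0$ 

In case (1), by Lemma 2, $\mathcal{D}^{(4)} = \mathcal{D}^{(3)} + H_w(\Delta X^{(4)}) \ge \mathcal{P}_d + H_w(\Delta X^{(4)}) \ge \mathcal{P}_d + 1$. 

In case (2), $\Delta X^{(1)} = 0$ leads to $\Delta Y^{(2)} = \Delta X^{(3)}$. Thus, $\mathcal{D}^{(4)} = H_w(\Delta X^{(2)}) + 
H_w(\Delta Y^{(2)}) + H_w(\Delta X^{(4)}) \ge \mathcal{P}_d + H_w(\Delta X^{(4)}) \ge \mathcal{P}_d + 1$. 
Similarly, in cases (3), (4), and (5), we get $\mathcal{D}^{(4)} \ge \mathcal{P}_d + 1$. 
In case (6), by Lemma 2, $\mathcal{D}^{(4)} \ge \mathcal{P}_d$. 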

Q.E.D. 
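
The case analysis in the proof can be reproduced mechanically: among the sixteen zero/non-zero patterns of four round input differences, only those without two zeros in consecutive rounds or in rounds two apart remain. A short sketch (the function name `admissible_patterns` is illustrative):

```python
from itertools import product


def admissible_patterns(rounds=4):
    """Zero/non-zero patterns (True = non-zero) of the round input differences
    with no two zeros in consecutive rounds and no two zeros two rounds apart."""
    result = []
    for pattern in product([True, False], repeat=rounds):
        zeros = [i for i, nonzero in enumerate(pattern) if not nonzero]
        # Neighbouring zero positions must be at least three rounds apart.
        if any(b - a <= 2 for a, b in zip(zeros, zeros[1:])):
            continue
        result.append(pattern)
    return result


for p in admissible_patterns():
    print(" ".join("!=0" if nonzero else " =0" for nonzero in p))
```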

From the above proof, the following corollary is obtained. 

Corollary 1. The minimum number of differential active s-boxes in any four 
consecutive rounds satisfies 

(i) $\mathcal{D}^{(4)} \ge \mathcal{P}_d$, if and only if the input differences in both the first round and the 
fourth round are zero. 

(ii) $\mathcal{D}^{(4)} \ge \mathcal{P}_d + 1$ in the other cases. 

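Lemma 4. The minimum number of differential active s-boxes in any six 
consecutive rounds satisfies $\mathcal{D}^{(6)} \ge \mathcal{P}_d + 2$. 

Proof. 
- If $\Delta X^{(2)} \ne 0$ and $\Delta X^{(5)} \ne 0$, by Lemma 2, $\mathcal{D}^{(6)} = \mathcal{D}^{(3)} + \mathcal{D}^{(3)} \ge 2 \times \mathcal{P}_d$. 

- If $\Delta X^{(2)} = \Delta X^{(5)} = 0$, we get $\Delta X^{(1)} = \Delta X^{(3)}$ and $\Delta Y^{(3)} = \Delta X^{(4)} = \Delta X^{(6)}$. 
Thus, $\mathcal{D}^{(6)} = 2 \times (H_w(\Delta X^{(3)}) + H_w(\Delta X^{(4)})) = 2 \times (H_w(\Delta X^{(3)}) + H_w(\Delta Y^{(3)})) \ge 2 \times \mathcal{P}_d$. 

- If $\Delta X^{(2)} = 0$ and $\Delta X^{(5)} \ne 0$, or $\Delta X^{(2)} \ne 0$ and $\Delta X^{(5)} = 0$, then 
$\mathcal{D}^{(6)} = \mathcal{D}^{(3)} + \mathcal{D}^{(3)} \ge \mathcal{P}_d + 2$ by Lemma 2. 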

Q.E.D. 

Lemma 5. The minimum number of differential active s-boxes in any eight 
consecutive rounds satisfies $\mathcal{D}^{(8)} \ge 2 \times \mathcal{P}_d + 1$. 

Proof. Again, Corollary 1 shows that, in any four consecutive rounds, the minimum 
number of differential active s-boxes satisfies (i) $\mathcal{D}^{(4)} \ge \mathcal{P}_d$, if and only if 
the input differences in both the first round and the fourth round are zero, and 
(ii) $\mathcal{D}^{(4)} \ge \mathcal{P}_d + 1$ in the other cases. 

Since there is no case in which both input differences into any two consecutive 
rounds are zero at the same time, the input differences in both the fourth and 
fifth rounds cannot be zero. That is, the eight consecutive rounds cannot be 
divided into two instances of case (i). Thus, $\mathcal{D}^{(8)} \ge \mathcal{P}_d + (\mathcal{P}_d + 1) = 2 \times \mathcal{P}_d + 1$. 



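Q.E.D. 

Lemma 6. The minimum number of differential active s-boxes in any twelve 
consecutive rounds satisfies $\mathcal{D}^{(12)} \ge 3 \times \mathcal{P}_d + 1$. 

Proof. Twelve rounds can be split in three ways, giving the lower bounds 
$4 \times \mathcal{D}^{(3)}$, $2 \times \mathcal{D}^{(6)}$, and $\mathcal{D}^{(8)} + \mathcal{D}^{(4)}$. Since $\mathcal{D}^{(12)}$ satisfies the three 
evaluations at the same time, 
$\mathcal{D}^{(12)} \ge \max\{4 \times \mathcal{D}^{(3)}, 2 \times \mathcal{D}^{(6)}, \mathcal{D}^{(8)} + \mathcal{D}^{(4)}\} \ge \mathcal{D}^{(8)} + \mathcal{D}^{(4)} \ge 3 \times \mathcal{P}_d + 1$. 

Q.E.D. 

From the proofs of the above-mentioned lemmas, the following useful theorem 
for $4r$-round Feistel ciphers is established. 

Theorem 3. The minimum number of differential active s-boxes $\mathcal{D}^{(4r)}$ for $4r$-round 
Feistel ciphers with SPN round function satisfies $\mathcal{D}^{(4r)} \ge r \times \mathcal{P}_d + \lfloor r/2 \rfloor$. 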
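
The bounds of Lemmas 2 to 5 can be compared against an exhaustive search on a toy cipher. The sketch below assumes $n = 3$ words of $m = 2$ bits and a hypothetical invertible binary matrix `P`, none of which come from the paper. Each active s-box is modelled only through bijectivity, so any non-zero output difference is allowed for a non-zero input difference; the search therefore covers a superset of the real characteristics.

```python
from itertools import product

# Assumptions for this sketch: toy sizes and a hypothetical invertible matrix P.
N, M = 3, 2
P = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 1]]
VECTORS = list(product(range(1 << M), repeat=N))


def apply_p(z):
    out = []
    for row in P:
        acc = 0
        for bit, zj in zip(row, z):
            if bit:
                acc ^= zj
        out.append(acc)
    return tuple(out)


def weight(x):
    return sum(1 for xi in x if xi != 0)


def same_support(dx):
    """All dZ with the same zero/non-zero pattern as dX (bijective s-boxes)."""
    return product(*[range(1, 1 << M) if xi else (0,) for xi in dx])


def branch_number():
    return min(weight(dz) + weight(apply_p(dz)) for dz in VECTORS if any(dz))


def min_active_sboxes(max_rounds):
    """Map r -> minimum of sum_{i=1..r} H_w(dX^(i)) over non-trivial trails."""
    # state (dX^(i-1), dX^(i)) -> fewest active s-boxes accumulated so far
    best = {(a, b): 0 for a in VECTORS for b in VECTORS if any(a) or any(b)}
    result = {}
    for r in range(1, max_rounds + 1):
        nxt = {}
        for (prev, cur), cost in best.items():
            cost_here = cost + weight(cur)
            for dz in same_support(cur):
                new_x = tuple(p ^ y for p, y in zip(prev, apply_p(dz)))
                key = (cur, new_x)
                if cost_here < nxt.get(key, 1 << 30):
                    nxt[key] = cost_here
        best = nxt
        result[r] = min(best.values())
    return result


pd = branch_number()
minimum = min_active_sboxes(8)
for r in (3, 4, 6, 8):
    print(r, minimum[r])
```

The printed minima can be compared with $2$, $\mathcal{P}_d$, $\mathcal{P}_d + 2$, and $2\mathcal{P}_d + 1$ for 3, 4, 6, and 8 rounds; the search grows quickly with $n$ and $m$ and is not practical for real cipher sizes.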

Knudsen argued that for a Feistel cipher to be practically secure against 
differential and linear cryptanalyses, the upper bounds of the maximum differential 
and linear characteristic probabilities must be less than the security threshold. 
Generally speaking, the security threshold is equated to the inverse of the 
number of all plaintext blocks, i.e., $2^{-64}$ for 64-bit ciphers and $2^{-128}$ for 128-bit 
ciphers. 

For example, let the maximum differential probability of an 8-bit s-box be 
$p_s = 2^{-6}$ and the differential branch number be $\mathcal{P}_d = 5$. It follows that 18-round 
Feistel ciphers, such as Camellia [1], are practically secure against differential 
cryptanalysis because of the following corollary. 

Corollary 2. Assuming that the round function consists of s-boxes yielding the 
maximum differential probability $p_s = 2^{-6}$ and P-function yielding the differential 
branch number $\mathcal{P}_d = 5$, then a 128-bit Feistel cipher with 16 or more 
rounds has no effective differential characteristic. 

Proof. By Definition 8 and Theorem 3, $p_d^{(16)} \le p_s^{\mathcal{D}^{(16)}} \le (2^{-6})^{4 \times 5 + 2} = 2^{-132} < 2^{-128}$. 

Q.E.D. 
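
The arithmetic behind Corollary 2 follows directly from Theorem 3 and can be written out as a small sketch. The function names below are illustrative; probabilities are handled as base-2 logarithms to keep the computation exact.

```python
def min_active_lower_bound(rounds, branch):
    """Theorem 3 lower bound r * P_d + floor(r / 2) for a 4r-round cipher."""
    if rounds % 4:
        raise ValueError("Theorem 3 is stated for 4r rounds")
    r = rounds // 4
    return r * branch + r // 2


def log2_characteristic_bound(rounds, branch, log2_ps):
    """log2 of the upper bound p_s ** D^(4r) on the characteristic probability."""
    return log2_ps * min_active_lower_bound(rounds, branch)


# p_s = 2^-6 and P_d = 5, as in the example above, for 16 rounds.
bound = log2_characteristic_bound(rounds=16, branch=5, log2_ps=-6)
print(min_active_lower_bound(16, 5), bound, bound < -128)
```

For 16 rounds this gives 22 active s-boxes and an exponent of $-132$, which is below the $-128$ threshold for a 128-bit block.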



5 Upper Bound of Linear Characteristic Probability 

In this section, the upper bound of linear characteristic probability is derived 
in the same way as in the previous section. That is, our goal is to clarify the 
(lower bound of) minimum number of linear active s-boxes in some consecutive 
rounds of Feistel cipher using the duality of differential characteristic and linear 
approximation. 

First, the following theorem is established. 

Theorem 4. Consider a Feistel cipher with SPN round function. If the linear 
transformation layer P (P-function) is bijective, the cipher can be transformed 
into a Feistel cipher with the PSN round function. 

Proof. From the assumption that P-function is bijective, let $P(Z)$ denote 
the transformation of $Z$ by the P-function, and $P^{-1}(Z)$ that by the inverse 
function of the P-function. 

As mentioned above, in a Feistel cipher with SPN round function, the equation 
$X^{(i+1)} = X^{(i-1)} \oplus P(S(X^{(i)}))$ is satisfied. Now, let $V^{(i)} = P^{-1}(X^{(i)})$. The 
above equation can be transformed as follows, since $C = A \oplus P(B) \Leftrightarrow C = 
P(P^{-1}(A) \oplus B)$ for any $(A, B, C)$. 

\begin{align*}
& X^{(i+1)} = X^{(i-1)} \oplus P(S(X^{(i)})) = P(P^{-1}(X^{(i-1)}) \oplus S(X^{(i)})) \\
\Leftrightarrow\ & P(V^{(i+1)}) = P(V^{(i-1)} \oplus S(P(V^{(i)}))) \\
\Leftrightarrow\ & V^{(i+1)} = V^{(i-1)} \oplus S(P(V^{(i)}))
\end{align*}

The equation $V^{(i+1)} = V^{(i-1)} \oplus S(P(V^{(i)}))$ denotes a Feistel cipher with 
the PSN round function. Accordingly, the ciphertext $(X^{(r)}, X^{(r+1)})$ obtained by 
applying a Feistel cipher with SPN round function to a plaintext $(X^{(0)}, X^{(1)})$ 
is equivalent to the result of changing the plaintext to $(V^{(0)}, V^{(1)})$ 
by the $P^{-1}$-function first, then getting $(V^{(r)}, V^{(r+1)})$ from $(V^{(0)}, V^{(1)})$ by the 
Feistel cipher with PSN round function, and finally transforming it into the 
ciphertext by the P-function. 

Q.E.D. 
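
The transformation in Theorem 4 can be checked on a toy cipher. The sketch below assumes four 4-bit words, a sample s-box table, and a hypothetical P-function in which each output word is the XOR of the other three input words (this particular P is its own inverse). Round keys are omitted, as in the equations above; all names are illustrative.

```python
# Sample 4-bit s-box (assumption: illustrative data only).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def s_layer(x):
    return tuple(SBOX[xi] for xi in x)


def p_layer(z):
    """Toy P (assumption): y_i = XOR of all z_j with j != i; an involution for n = 4."""
    total = 0
    for zj in z:
        total ^= zj
    return tuple(total ^ zi for zi in z)


def xor(a, b):
    return tuple(ai ^ bi for ai, bi in zip(a, b))


def feistel_spn(x0, x1, rounds):
    """X^(i+1) = X^(i-1) ^ P(S(X^(i))); returns (X^(r), X^(r+1))."""
    prev, cur = x0, x1
    for _ in range(rounds):
        prev, cur = cur, xor(prev, p_layer(s_layer(cur)))
    return prev, cur


def feistel_psn(v0, v1, rounds):
    """V^(i+1) = V^(i-1) ^ S(P(V^(i))); returns (V^(r), V^(r+1))."""
    prev, cur = v0, v1
    for _ in range(rounds):
        prev, cur = cur, xor(prev, s_layer(p_layer(cur)))
    return prev, cur


x0, x1 = (1, 2, 3, 4), (5, 6, 7, 8)
direct = feistel_spn(x0, x1, rounds=6)
# Apply P^{-1} (= P here) to the plaintext, run the PSN cipher, then apply P.
v_r, v_r1 = feistel_psn(p_layer(x0), p_layer(x1), rounds=6)
via_psn = (p_layer(v_r), p_layer(v_r1))
print(direct == via_psn)
```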

Starting with the duality between differential characteristic and linear 
approximation, we will define the linear branch number $\mathcal{P}_l$, which is similar to the 
differential branch number $\mathcal{P}_d$. Hereafter, we assume P-function is bijective. 

Definition 9. The linear branch number $\mathcal{P}_l$ is defined as: 

\[ \mathcal{P}_l = \min_{\Gamma Y \ne 0} \left( H_w(P^{*}(\Gamma Y)) + H_w(\Gamma Y) \right) = \min_{\Gamma Y \ne 0} \left( H_w(\Gamma Z) + H_w(\Gamma Y) \right), \]

where $\Gamma Y$ and $\Gamma Z$ are an output mask value and an input mask value of the 
P-function, respectively, and $P^{*}$ is a diffusion function of mask values concerning 
the P-function. 

Next, we will define the upper bound of the linear characteristic probability 
of a Feistel cipher with SPN round function. That is, the upper bound of the 
probability is shown by the (lower bound of) minimum number of linear active 
s-boxes. 

Definition 10. Assume a Feistel cipher with SPN round function. If $H_w(\Gamma Z^{(i)})$ 
is the number of the $i$th-round linear active s-boxes, then the linear characteristic 
probability of the $r$-round Feistel cipher, $p_l^{(r)}$, satisfies the following relationship. 

\[ p_l^{(r)} \le q_s^{\,\min_{(\Gamma Y^{(0)}, \ldots, \Gamma Y^{(r)}, \Gamma Y^{(r+1)}) \ne (\ldots, 0, 0)} \sum_{i=1}^{r} H_w(\Gamma Z^{(i)})}, \]

where $\Gamma Z^{(i)} = P^{*}(\Gamma Y^{(i)})$ and $P^{*}$ is the diffusion function of mask values 
concerning the P-function. 

From this definition, clarifying the upper bound of the linear characteristic 
probability becomes equivalent to determining the (lower bound of) minimum 
number of linear active s-boxes. To simplify the discussion of this minimum 
number, we denote it as follows. 

\[ \mathcal{L}^{(r)} = \min_{(\Gamma Y^{(0)}, \ldots, \Gamma Y^{(r)}, \Gamma Y^{(r+1)}) \ne (\ldots, 0, 0)} \sum_{i=1}^{r} H_w(\Gamma Z^{(i)}) \]

Theorem 5. Assume a Feistel cipher with SPN round function. If both S-function 
and P-function are bijective, then $\mathcal{L}^{(r)}$ and $\mathcal{P}_l$ also satisfy Lemma 2 
to Lemma 6 and Theorem 3. 

Proof. Because of the bijective P-function, a Feistel cipher with SPN round 
function is transformed into one with PSN round function by Theorem 4. The 
cipher can be described as: 

\[ V^{(i+1)} = V^{(i-1)} \oplus S(P(V^{(i)})) = V^{(i-1)} \oplus Z^{(i)}, \]

where $V^{(i)} = P^{-1}(X^{(i)})$, $Z^{(i)} = S(X^{(i)})$. 

From the duality between differential characteristic and linear approximation, 
the linear approximation of the round function of the transformed cipher can be 
expressed as follows using the concatenation rules [4,14]. 

\[ \Gamma V^{(i)} = \Gamma Z^{(i-1)} \oplus \Gamma Z^{(i+1)} = P^{*}(\Gamma X^{(i)}) \]

Since S-function is bijective, $H_w(\Gamma X) = H_w(\Gamma Z)$ because $\Gamma x_i$ 
is a non-zero input mask value of a linear active s-box $s_i$. Therefore, the linear 
branch number $\mathcal{P}_l$ is redefined as: 

\[ \mathcal{P}_l = \min_{\Gamma X \ne 0} \left( H_w(P^{*}(\Gamma X)) + H_w(\Gamma X) \right) = \min_{\Gamma Z \ne 0} \left( H_w(\Gamma V) + H_w(\Gamma Z) \right) \]

Accordingly, if $\Delta X^{(i)}$ and $\Delta Y^{(i)}$ are exchanged for $\Gamma Z^{(i)}$ and $\Gamma V^{(i)}$, respectively, 
all proofs go through in the same way as for Lemma 2 
to Lemma 6 and Theorem 3. 

Q.E.D. 



For example, the P*-function of Camellia can be expressed as: 

Thus, it is easily seen that $\mathcal{P}_l = 5$, and the following corollary is obtained. 

Corollary 3. Camellia reduced to 16 rounds (without FL- and FL$^{-1}$-functions) 
has no effective linear approximation. 

Proof. The maximum linear probability of Camellia’s s-boxes is $q_s = 2^{-6}$. From 
Theorem 5 and $\mathcal{P}_l = 5$, the maximum linear characteristic probability of Camellia 
reduced to 16 rounds is also upper bounded by $q_s^{22} = 2^{-132}$. 
Q.E.D. 



6 Necessary Condition for Desirable P-Functions 



In this section, we consider the necessary condition for desirable P-functions. 
Here, “desirable” means that the round functions are invulnerable to linear 
cryptanalysis as well as differential cryptanalysis. 

Obviously, the condition is $\mathcal{P}_d = \mathcal{P}_l$ from Sect. 4 and Sect. 5. Thus, we 
investigate P-functions wherein $\mathcal{P}_d = \mathcal{P}_l$. 



Theorem 6. Assume that P-function is bijective and is expressed as an $n \times n$ 
matrix $P$ over $GF(2)^m$. When the P-function satisfies $[y_i]^t = [p_{ij}][z_j]^t$, the 
following relations are satisfied. 

\[ [\Delta y_i]^t = [p_{ij}][\Delta z_j]^t, \qquad [\Gamma z_i]^t = [p_{ij}]^t [\Gamma y_j]^t = [p_{ji}][\Gamma y_j]^t, \]

where $[x_i]$ denotes the vector (or matrix) of $X$ and $[x_i]^t$ denotes the transposed 
vector (or matrix) of $X$. 

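Proof. First, since $y_i = \bigoplus_{j=1}^{n} (p_{ij} \cdot z_j)$, 

\begin{align*}
\Delta y_i = y_i \oplus y'_i &= \bigoplus_{j=1}^{n} (p_{ij} \cdot z_j) \oplus \bigoplus_{j=1}^{n} (p_{ij} \cdot z'_j) \\
&= \bigoplus_{j=1}^{n} (p_{ij} \cdot z_j \oplus p_{ij} \cdot z'_j) \\
&= \bigoplus_{j=1}^{n} (p_{ij} \cdot (z_j \oplus z'_j)) = \bigoplus_{j=1}^{n} (p_{ij} \cdot \Delta z_j)
\end{align*}

Thus, $[\Delta y_i]^t = [p_{ij}][\Delta z_j]^t$ is satisfied. 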

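Second, since the P-function is bijective, $Z \cdot \Gamma Z = Y \cdot \Gamma Y$. Then, 

\begin{align*}
Y \cdot \Gamma Y &= \bigoplus_{j=1}^{n} \left( \left( \bigoplus_{i=1}^{n} (p_{ji} \cdot z_i) \right) \cdot \Gamma y_j \right) = \bigoplus_{j=1}^{n} \left( \bigoplus_{i=1}^{n} (p_{ji} \cdot z_i \cdot \Gamma y_j) \right) \\
&= \bigoplus_{i=1}^{n} \left( \bigoplus_{j=1}^{n} ((p_{ji} \cdot \Gamma y_j) \cdot z_i) \right) = \bigoplus_{i=1}^{n} \left( \left( \bigoplus_{j=1}^{n} (p_{ji} \cdot \Gamma y_j) \right) \cdot z_i \right)
\end{align*}

On the other hand, since $Z \cdot \Gamma Z = \bigoplus_{i=1}^{n} (z_i \cdot \Gamma z_i)$, it is obvious that 
$\Gamma z_i = \bigoplus_{j=1}^{n} (p_{ji} \cdot \Gamma y_j) = \bigoplus_{j=1}^{n} (p^{t}_{ij} \cdot \Gamma y_j)$. Thus, 
$[\Gamma z_i]^t = [p_{ji}][\Gamma y_j]^t = [p_{ij}]^t [\Gamma y_j]^t$ is satisfied. 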

Q.E.D. 
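
The mask relation $\Gamma Z = P^t \cdot \Gamma Y$ can be spot-checked numerically by confirming that $Z \cdot \Gamma Z = Y \cdot \Gamma Y$ holds for random inputs. The sketch below assumes a hypothetical non-symmetric binary matrix acting on 8-bit words; the matrix and the function names are illustrative only.

```python
import random

# Hypothetical non-symmetric binary matrix (assumption, not from the paper).
P = [[0, 1, 1, 1],
     [1, 0, 1, 1],
     [1, 1, 1, 0],
     [0, 1, 0, 1]]


def mat_vec(matrix, v):
    """Binary matrix times a vector of m-bit words (XOR of selected words)."""
    out = []
    for row in matrix:
        acc = 0
        for bit, vj in zip(row, v):
            if bit:
                acc ^= vj
        out.append(acc)
    return tuple(out)


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def dot(a, b):
    """Bitwise inner product over GF(2) of two word vectors."""
    acc = 0
    for ai, bi in zip(a, b):
        acc ^= bin(ai & bi).count("1") & 1
    return acc


def mask_relation_holds(trials=1000, m=8, seed=0):
    rng = random.Random(seed)
    p_t = transpose(P)
    for _ in range(trials):
        z = tuple(rng.randrange(1 << m) for _ in P)
        gy = tuple(rng.randrange(1 << m) for _ in P)
        y = mat_vec(P, z)
        gz = mat_vec(p_t, gy)  # Gamma Z = P^t . Gamma Y
        if dot(z, gz) != dot(y, gy):
            return False
    return True


print(mask_relation_holds())
```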



Theorem 7. Assume that $\mathcal{I}$ is the set of matrices that consist of exactly one 
1-element and $(n-1)$ 0-elements in each row and column, i.e., the matrices generated 
by only interchanging rows and/or columns of the unit matrix. 

If a bijective P-function can be expressed as an $n \times n$ matrix $P$ over $GF(2)^m$ 
such that $P^t \cdot P \in \mathcal{I}$ or $P^t = I_2 \cdot P \cdot I_1$ where $I_1, I_2 \in \mathcal{I}$, then the P-function 
satisfies $\mathcal{P}_d = \mathcal{P}_l$. 

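Proof. By Theorem 6, if the P-function can be expressed as an $n \times n$ matrix $P$ 
over $GF(2)^m$, then 

\[ \mathcal{P}_d = \min_{\Delta Z \ne 0} \left( H_w(\Delta Z) + H_w(P(\Delta Z)) \right), \qquad \mathcal{P}_l = \min_{\Gamma Y \ne 0} \left( H_w(P^t(\Gamma Y)) + H_w(\Gamma Y) \right). \]

(i) In the case of $P^t \cdot P = I^{*} \in \mathcal{I}$, let $\Gamma Y = P(\Gamma W)$. 
Since the P-function is bijective, it is guaranteed that $\{\Gamma Y\} = \{\Gamma W\}$. Thus, 

\begin{align*}
\mathcal{P}_l &= \min_{\Gamma W \ne 0} \left( H_w(P^t(P(\Gamma W))) + H_w(P(\Gamma W)) \right) \\
&= \min_{\Gamma W \ne 0} \left( H_w(I^{*} \cdot \Gamma W) + H_w(P(\Gamma W)) \right).
\end{align*}

Here, because $I^{*} \in \mathcal{I}$, $I^{*} \cdot \Gamma W$ leads to another vector simply by interchanging 
the elements of $\Gamma W$. Thus, $H_w(\Gamma W) = H_w(I^{*} \cdot \Gamma W)$. As a result, $\exists (\Delta Z, \Gamma W)$ 
s.t. $\mathcal{P}_d = \mathcal{P}_l$. 

(ii) In the case of $P^t = I_2 \cdot P \cdot I_1$ where $I_1, I_2 \in \mathcal{I}$, as mentioned above, since $I_1$ 
and $I_2$ lead to another vector simply by interchanging the elements, $H_w(\Delta X) = 
H_w(I_1(\Delta X))$ and $H_w(I_2(\Gamma W)) = H_w(\Gamma W)$. Now, let $\Delta Z = I_1(\Delta X)$. Since $I_1$ 
is bijective, it is guaranteed that $\{\Delta X\} = \{\Delta Z\}$. Thus, 

\begin{align*}
\mathcal{P}_d &= \min_{\Delta X \ne 0} \left( H_w(I_1(\Delta X)) + H_w(P \cdot I_1(\Delta X)) \right) \\
&= \min_{\Delta X \ne 0} \left( H_w(\Delta X) + H_w(P \cdot I_1(\Delta X)) \right).
\end{align*}

On the other hand, 

\begin{align*}
\mathcal{P}_l &= \min_{\Gamma Y \ne 0} \left( H_w(I_2 \cdot P \cdot I_1(\Gamma Y)) + H_w(\Gamma Y) \right) \\
&= \min_{\Gamma Y \ne 0} \left( H_w(P \cdot I_1(\Gamma Y)) + H_w(\Gamma Y) \right).
\end{align*}

As a result, $\exists (\Delta X, \Gamma Y)$ s.t. $\mathcal{P}_d = \mathcal{P}_l$. 

Q.E.D. 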
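
Case (ii) can be illustrated on a toy matrix. The sketch below assumes a symmetric 4 x 4 binary matrix `BASE` whose columns are permuted to obtain a non-symmetric `P`; by construction $P^t = I_2 \cdot P \cdot I_1$ for permutation matrices $I_1, I_2$. Both branch numbers are then computed exhaustively over 2-bit words. The matrices and names are hypothetical.

```python
from itertools import product

BASE = [[0, 1, 1, 1],
        [1, 0, 1, 1],
        [1, 1, 0, 1],
        [1, 1, 1, 0]]   # symmetric toy matrix (assumption)
PERM = [2, 0, 3, 1]     # hypothetical column permutation

# P is BASE with permuted columns, hence P^t equals P up to permutations.
P = [[row[j] for j in PERM] for row in BASE]


def mat_vec(matrix, v):
    out = []
    for row in matrix:
        acc = 0
        for bit, vj in zip(row, v):
            if bit:
                acc ^= vj
        out.append(acc)
    return tuple(out)


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def weight(x):
    return sum(1 for xi in x if xi != 0)


def branch_number(matrix, m=2):
    """min over non-zero v of H_w(v) + H_w(matrix . v), by exhaustive search."""
    n = len(matrix)
    return min(weight(v) + weight(mat_vec(matrix, v))
               for v in product(range(1 << m), repeat=n) if any(v))


p_d = branch_number(P)             # differential: H_w(dZ) + H_w(P dZ)
p_l = branch_number(transpose(P))  # linear: H_w(P^t GY) + H_w(GY)
print(p_d, p_l)
```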

For example, the relationship between the P-function and the P*-function of Camellia 
is shown as follows. Thus Theorem 7 indicates that the P-function of Camellia 
is “desirable.” 

\[ P^{*}_{\mathrm{Camellia}} = P^{t}_{\mathrm{Camellia}} = I_2 \cdot P_{\mathrm{Camellia}} \cdot I_1, \qquad I_1, I_2 \in \mathcal{I} \]

7 Conclusion 

This paper studied the upper bounds of the maximum differential and linear 
characteristic probabilities of Feistel ciphers with SPN round function. In the 
same way as for SPN ciphers, we considered the minimum number of differ- 
ential and linear active s-boxes, which are a measure of the upper bounds of 
these probabilities, in order to evaluate security against differential and linear 
cryptanalyses. The advantage of this method is that it considers the interre- 
lation between input and output data in consecutive rounds, unlike Knudsen’s 
estimation. 

We focused on the minimum number of active s-boxes in some consecutive 
rounds of Feistel ciphers, i.e., in three, four, six, eight, and twelve consecutive 
rounds, since they can determine the upper bounds of the maximum differential 
and linear probabilities using the differential and linear branch numbers $\mathcal{P}_d$, $\mathcal{P}_l$, 
respectively. These numbers capture the avalanche effects of P-functions with 
regard to differential and linear characteristics. As a result, we clarified that 
the lower bounds of the minimum number of differential (resp. linear) active 
s-boxes are $2$, $\mathcal{P}_d$ ($\mathcal{P}_l$), $\mathcal{P}_d + 2$ ($\mathcal{P}_l + 2$), $2\mathcal{P}_d + 1$ ($2\mathcal{P}_l + 1$), and $3\mathcal{P}_d + 1$ ($3\mathcal{P}_l + 1$), 
respectively. The interesting result is that the lower bound of the minimum 
number of active s-boxes is proportional to the branch number every fourth 
round, while it seems to be every third round at first glance. Furthermore, this 
means that, if the branch number is the same, a $2r$-round Feistel cipher has 
almost the same invulnerability to differential and linear cryptanalyses as an $r$-round 
SPN cipher in terms of the upper bounds of the maximum differential and linear 
probabilities. 

Finally, we investigated the necessary condition for desirable P-functions, 
which means that the round functions are invulnerable to both differential and 
linear cryptanalyses. In addition, we showed the example of the round function 
of Camellia, which satisfies the condition. 